Improving Time to Value for Data Engineers and Scientists.

Monitoring and debugging an entire AI data model is incredibly taxing. You not only have to be aware of what is currently breaking; you also have to conceptualize how prior choices and errors may be impacting the current iteration of the data model. This project’s goal was to substantially lower the barrier to success for users and improve their time to value.

My Role
Project Lead / Solo Designer

Timeframe
8 weeks

Summary

Product

Machine Learning Data Management Software (MLDM) enables users to create, track, and monitor the flow of data used and produced by their AI models. Users navigate this information through a relational diagram and data tables.

Personas

Data Engineers & Scientists → Primarily use MLDM to debug their AI data models at every stage of development. Frustration with identifying the root causes of problems is the driving pain point for this project.

Business Problems

Mounting frustration has led to customer churn.

Design Problems

Users need to be able to reason about two things: 1) where errors are located in the stream of data that runs from initial ingestion to the final output of the AI model, and 2) how errors and choices made in the past affect the final output in the current iteration.

Outputs

  • 2 designed and shipped experiences revolving around an “error inbox” and a “time machine.”

  • 1 designed but unshipped experience refactoring how information is presented in side panels.

Outcomes

Unfortunately, we could not collect user data due to the sensitive nature of our clients' projects. However, we received a lot of anecdotal feedback from users and the support engineering team communicating their excitement for the new features, the most important being “I don’t have to do everything in the command line anymore!”

The Problem

The synthesized user research and feature requests pointed to user flow fatigue and an inability to understand the impact of changes made over time.


Core User Needs

  • Fatigue → Users struggle to prioritize information about what needs to be debugged. AI data models can have upwards of 300 different data transformation steps.

  • Understanding changes → Changes to the transformation code can have disastrous consequences.

Translating User Needs Into Design Hypotheses

  • Fatigue → Surfacing failure statuses higher will reduce fatigue by simplifying how users fetch status data.

  • Understanding changes → A time machine will improve how users reason about changes over time.

The solutions didn’t just come from me. They came from users, co-creation with the whole team, and connecting existing features in a way that creates a stronger sense of context.


Going From Design Hypotheses to Concrete Direction.

  • Organizing and consolidating information → An “inbox of doom” will eliminate the task of finding out where in the data model errors have occurred.

  • Understanding changes → A comparison experience for versions of the data model will create a contextual space for users to understand changes.

  • Understanding changes → Providing a shortcut to the corresponding data tables in the comparison experience will accommodate users who prefer a tabular format over a graphical one.

Quick Validation

To validate the design direction and prevent shipping an unwanted experience, I conducted a research survey with a clickable prototype, which I sent out to 2 of our customers. The survey returned 6 responses from users.

  • “Inbox of doom” → improved experience of fetching status information.

  • Comparison experience → improved experience of understanding changes made over time.

Even with a low sample size, it was clear that the direction was solid. None of the respondents rated the proposed features lower than neutral on the Likert-scale portion of the survey.

We shipped and are live!


Next Steps

  • Roadmap planning for the remaining feature that could not be shipped due to a lack of capacity for the quarter.

  • Follow-up monitoring and evaluation with users on the efficacy of the new experiences and features.