Tutorial: Analyzing LLM Trace Data

In this tutorial, we'll use Probably to analyze LLM trace data and answer one question: "Which model has the most stable latency distribution?" Along the way, we'll load a multi-column dataset, compare latency distributions across models, and compute a metric that makes "stability" measurable.

Dataset Overview

For this tutorial, we'll use a dataset of LLM traces. Each row in our dataset represents a single interaction with an LLM and includes the following columns:

  • input_variables: Variables used in the prompt
  • rendered_prompt: The full prompt sent to the model
  • model: The specific LLM used (e.g., GPT-3.5-turbo, GPT-4)
  • sampling_params: Parameters used for text generation (e.g., temperature, top_p)
  • output: The generated text from the LLM
  • token_count: Number of tokens in the interaction
  • cost: The cost of the API call
  • latency: Time taken for the LLM to respond (in milliseconds)

Step 1: Loading the Dataset

  1. Download the LLM trace dataset from [link to dataset].
  2. Open Probably and click on "Add Dataset" in the top right corner.
  3. Upload the CSV file containing the LLM trace data.
  4. Review the auto-detected column types and adjust if necessary.
  5. Click "Confirm" to load the dataset.
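For readers who want to sanity-check the file before uploading it, the load-and-review step can be sketched in plain Python. This is a minimal illustration, not Probably's import logic: the inline CSV rows are made-up sample values, only a subset of the columns is shown, and `load_traces` is a hypothetical helper name. Converting the numeric columns up front surfaces type problems early, much like reviewing the auto-detected column types.

```python
import csv
import io

# Made-up sample rows for illustration only; a real export would be read from a file.
SAMPLE_CSV = """model,token_count,cost,latency
GPT-4,120,0.0072,1850
GPT-3.5-turbo,95,0.0002,640
GPT-4,300,0.0180,3920
"""

# Columns expected to be numeric, with the converter to apply to each.
NUMERIC_COLUMNS = {"token_count": int, "cost": float, "latency": float}


def load_traces(handle):
    """Read trace rows from a CSV file object and convert numeric columns."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        for column, convert in NUMERIC_COLUMNS.items():
            # Raises ValueError if a cell does not parse, flagging a bad column type.
            row[column] = convert(row[column])
        rows.append(row)
    return rows


traces = load_traces(io.StringIO(SAMPLE_CSV))
print(len(traces), "rows loaded")
```

To read a real file instead, pass `open(path, newline="")` in place of the `io.StringIO` object.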

Step 2: Initial Data Exploration

  1. Create a frequency plot of the model column to see the distribution of traces across different models.
  2. Create a box plot with model on the X-axis and latency on the Y-axis for an initial view of latency distributions.
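The numbers behind these two plots can be reproduced with a short sketch. It assumes the traces are already available as (model, latency) pairs; the values below are made up for illustration, and `box_stats` is a hypothetical helper. The quartile method is also an assumption, since the tutorial does not say which one Probably uses.

```python
from collections import Counter, defaultdict
from statistics import quantiles

# Made-up (model, latency_ms) pairs for illustration only.
traces = [
    ("GPT-4", 1800), ("GPT-4", 2100), ("GPT-4", 1950), ("GPT-4", 4000),
    ("GPT-3.5-turbo", 600), ("GPT-3.5-turbo", 650), ("GPT-3.5-turbo", 700),
]

# Frequency of traces per model (what the frequency plot displays).
counts = Counter(model for model, _ in traces)

# Group latencies by model so each group can be summarized separately.
by_model = defaultdict(list)
for model, latency in traces:
    by_model[model].append(latency)


def box_stats(values):
    """Return the five-number summary that a box plot draws."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


summary = {model: box_stats(values) for model, values in by_model.items()}

for model, stats in summary.items():
    print(model, counts[model], stats)
```

A wide gap between q1 and q3, or a max far above q3, is an early hint that a model's latency is less stable.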

Step 3: Analyzing Latency Distributions

  1. Create a histogram of latency for each model:
    • Set X-axis to latency
    • Set Y-axis to frequency
    • Use model as the Z-axis variable
  2. Observe the shape and spread of latency distributions for each model.
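As a rough illustration of what the per-model histogram computes, the sketch below bins latencies into fixed-width buckets and prints a text histogram for each model. The bin width and the sample values are assumptions chosen for illustration; Probably's own binning may differ.

```python
from collections import Counter, defaultdict

BIN_WIDTH_MS = 500  # assumed bin width; adjust to the latency range in your data

# Made-up (model, latency_ms) pairs for illustration only.
traces = [
    ("GPT-4", 1800), ("GPT-4", 2100), ("GPT-4", 1950), ("GPT-4", 4000),
    ("GPT-3.5-turbo", 600), ("GPT-3.5-turbo", 650), ("GPT-3.5-turbo", 700),
]

# histograms[model][bin_start] = number of traces whose latency falls in that bin.
histograms = defaultdict(Counter)
for model, latency in traces:
    bin_start = int(latency // BIN_WIDTH_MS) * BIN_WIDTH_MS
    histograms[model][bin_start] += 1

for model, bins in histograms.items():
    print(model)
    for bin_start in sorted(bins):
        bar = "#" * bins[bin_start]
        print(f"  {bin_start:>5}-{bin_start + BIN_WIDTH_MS:<5} ms  {bar}")
```

A model whose counts concentrate in one or two adjacent bins has a narrow distribution; counts scattered across distant bins suggest a long tail.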

Step 4: Calculating Stability Metrics

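  1. Use Probably's statistical functions to calculate the coefficient of variation (CV) for each model's latency:
    • CV = (Standard Deviation / Mean) * 100, which expresses the spread as a percentage of the mean
    • Because CV is relative to the mean, it allows a fair comparison between models with different average latencies
    • A lower CV indicates a more stable distribution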
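The CV formula above can be checked by hand with a few lines of Python. This sketch assumes the sample standard deviation (`statistics.stdev`); the tutorial does not say whether Probably uses the sample or population form, and the latency values are made up for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV as a percentage: (sample standard deviation / mean) * 100."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute a standard deviation")
    avg = mean(values)
    if avg == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / avg * 100


# Made-up latencies in milliseconds, grouped by model, for illustration only.
latencies_by_model = {
    "GPT-4": [1800, 2100, 1950, 4000],
    "GPT-3.5-turbo": [600, 650, 700],
}

cv_by_model = {
    model: coefficient_of_variation(values)
    for model, values in latencies_by_model.items()
}

for model, cv in cv_by_model.items():
    print(f"{model}: CV = {cv:.1f}%")
```

With very few traces per model the CV is noisy, so it is worth checking the per-model counts from Step 2 before trusting a ranking.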

Step 5: Visualizing Stability Metrics

  1. Create a bar plot with model on the X-axis and the calculated CV on the Y-axis.
  2. Sort the bars in ascending order of CV to easily identify the most stable model.
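The ranking that the sorted bar plot conveys amounts to an ascending sort by CV. The sketch below uses placeholder model names and made-up CV percentages purely for illustration, and prints a text version of the bar plot.

```python
# Made-up CV percentages for illustration; in practice these come from Step 4.
cv_by_model = {"model-a": 41.9, "model-b": 7.7, "model-c": 18.3}

# Sort ascending by CV so the most stable model comes first.
ranked = sorted(cv_by_model.items(), key=lambda item: item[1])
most_stable_model, lowest_cv = ranked[0]

for model, cv in ranked:
    bar = "#" * round(cv / 2)  # one '#' per 2 percentage points
    print(f"{model:<10} {cv:5.1f}%  {bar}")

print("Most stable:", most_stable_model)
```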

Step 6: Investigating Factors Affecting Latency

  1. Explore relationships between latency and other variables:
    • Create a scatter plot of token_count vs latency, colored by model
    • Create a box plot of latency grouped by a value from sampling_params (e.g., temperature) for the most stable model
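The relationship shown in the token_count vs latency scatter plot can also be summarized numerically. One option, sketched below, is the Pearson correlation coefficient computed per model. This is a swapped-in illustration rather than a feature the tutorial attributes to Probably, and the sample values are made up.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when either variable is constant")
    return cov / sqrt(var_x * var_y)


# Made-up (model, token_count, latency_ms) rows for illustration only.
traces = [
    ("GPT-4", 100, 1500), ("GPT-4", 200, 2500), ("GPT-4", 300, 3500),
    ("GPT-3.5-turbo", 100, 620), ("GPT-3.5-turbo", 200, 600), ("GPT-3.5-turbo", 300, 640),
]

# Collect token counts and latencies separately for each model.
tokens = defaultdict(list)
latencies = defaultdict(list)
for model, token_count, latency in traces:
    tokens[model].append(token_count)
    latencies[model].append(latency)

correlation_by_model = {
    model: pearson(tokens[model], latencies[model]) for model in tokens
}

for model, r in correlation_by_model.items():
    print(f"{model}: r = {r:.2f}")
```

A coefficient near 1 suggests that latency variation is largely explained by output length, in which case a high CV may reflect varied workloads rather than an unstable model.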

Conclusion

Summarize the findings:

  • Identify the model with the most stable latency distribution
  • Discuss any patterns or insights discovered during the analysis
  • Suggest potential next steps or areas for further investigation

By following this tutorial, you've learned how to use Probably to load LLM trace data, compare latency distributions across models, quantify stability with the coefficient of variation, and explore which variables may be driving latency. The same workflow applies to other questions about model performance, such as comparing cost or token usage across models.
