Utilizing Your Own Data with MacroTrace
Overview
While MacroTrace provides built-in support for various economic data sources like FRED and ONS, you can also analyze your own datasets using MacroTrace's from_dataframe() method. This allows you to leverage MacroTrace's revision analysis tools on custom data.
Required Data Structure
To use your own data with MacroTrace, you need to prepare a pandas DataFrame with the following three required columns:
| Column | Type | Description |
|---|---|---|
timestamp |
datetime | The observation date/time for each data point |
value |
numeric | The observed value at that timestamp |
release_date |
datetime | The date when this observation was released/published |
Creating a Time Series from DataFrame
import pandas as pd
from macrotrace import MTTimeSeries
# Example: Create a DataFrame with your data
df = pd.DataFrame({
'timestamp': ['2024-01-01', '2024-02-01', '2024-03-01', '2024-01-01', '2024-02-01'],
'value': [100.0, 105.0, 110.0, 99.5, 104.8],
'release_date': ['2024-01-15', '2024-02-15', '2024-03-15', '2024-02-15', '2024-03-15']
})
# Create MTTimeSeries from the DataFrame
ts = MTTimeSeries.from_dataframe(
df=df,
dataset_id='MY_CUSTOM_SERIES',
title='My Custom Economic Indicator',
units='Index (Base=100)',
frequency='MS',
seasonal_adjustment='Not Seasonally Adjusted'
)
Optional Parameters
When creating a time series from a DataFrame, you can provide optional metadata:
title: A descriptive title for your series (defaults todataset_id)units: The units of measurement (defaults to "Units")frequency: The frequency of observations denoted by a valid pandas offset alias (e.g., "MS", "QS", "AS"). If not provided, it will be inferred from the timestampsseasonal_adjustment: Description of seasonal adjustment method applied, if any
Including this metadata helps ensure clarity in visualizations.
Timezone Handling
Important: Both timestamp and release_date columns should be timezone-aware datetime objects. If your data lacks timezone information:
- MacroTrace will automatically assume UTC timezone
- You'll see a warning message in the logs
- To avoid this, explicitly set timezones when creating your DataFrame:
import pandas as pd
df['timestamp'] = pd.to_datetime(df['timestamp']).dt.tz_localize('UTC')
df['release_date'] = pd.to_datetime(df['release_date']).dt.tz_localize('UTC')
Or for a specific timezone:
Using Your Custom Time Series
Once created, your custom time series has access to all MacroTrace analysis methods:
# Access historical vintages
vintage_2024_02 = ts.as_of('2024-02-15')
# Generate vintage matrix
vintage_matrix = ts.generate_vintage_matrix()
# Plot revisions
ts.plot.revision_histogram()
# Compare vintages
ts.plot.timeseries_comparison(
vintage_dates=['2024-01-15', '2024-02-15', '2024-03-15'],
chart_type='line'
)
# Assess revision success
flags, success_rate = ts.assess_revision_success()
print(f"Revision success rate: {success_rate:.2%}")
Example: Loading from CSV
import pandas as pd
from macrotrace import MTTimeSeries
# Load your data from CSV
df = pd.read_csv('my_data.csv')
# Ensure proper data types
df['timestamp'] = pd.to_datetime(df['timestamp']).dt.tz_localize('UTC')
df['release_date'] = pd.to_datetime(df['release_date']).dt.tz_localize('UTC')
df['value'] = pd.to_numeric(df['value'])
# Create time series
ts = MTTimeSeries.from_dataframe(
df=df,
dataset_id='CUSTOM_GDP',
title='Custom GDP Estimates',
units='Billions of Dollars',
frequency='Quarterly',
seasonal_adjustment='Seasonally Adjusted at Annual Rate'
)
# Analyze
print(ts)
Troubleshooting
ValueError: DataFrame must contain columns
- Ensure your DataFrame has exactly these column names:
timestamp,value, andrelease_date. Column names are case-sensitive.
Timezone Warnings
- Add timezone information to your datetime columns using
pd.to_datetime().dt.tz_localize()as shown above. The program will default to UTC if no timezone is provided, but it's best to be explicit.
ValueError: Not enough observations to infer frequency
- Provide the
frequencyparameter explicitly when callingfrom_dataframe(), or ensure you have at least 2 observations with regular intervals.
Empty Vintage Chain
- Ensure your
release_datevalues vary. If all observations have the same release date, only one vintage will be created.