Tuesday, March 31, 2026
TheAdviserMagazine.com

ML Models Need Better Training Data: The GenAI Solution

by TheAdviserMagazine
1 year ago
in Investing


Our understanding of financial markets is inherently constrained by historical experience — a single realized timeline among countless possibilities that could have unfolded. Each market cycle, geopolitical event, or policy decision represents just one manifestation of potential outcomes.

This limitation becomes particularly acute when training machine learning (ML) models, which can inadvertently learn from historical artifacts rather than underlying market dynamics. As complex ML models become more prevalent in investment management, their tendency to overfit to specific historical conditions poses a growing risk to investment outcomes.

Generative AI-based synthetic data (GenAI synthetic data) is emerging as a potential solution to this challenge. While GenAI has gained attention primarily for natural language processing, its ability to generate sophisticated synthetic data may prove even more valuable for quantitative investment processes. By creating data that effectively represents “parallel timelines,” this approach can be designed and engineered to provide richer training datasets that preserve crucial market relationships while exploring counterfactual scenarios.

The Challenge: Moving Beyond Single Timeline Training

Traditional quantitative models face an inherent limitation: they learn from a single historical sequence of events that led to present conditions. This creates what we term “empirical bias.” The challenge becomes more pronounced with complex machine learning models, whose capacity to learn intricate patterns makes them particularly vulnerable to overfitting on limited historical data. An alternative approach is to consider counterfactual scenarios: those that might have unfolded if certain, perhaps arbitrary, events, decisions, or shocks had played out differently.

To illustrate these concepts, consider active international equities portfolios benchmarked to MSCI EAFE. Figure 1 shows the performance characteristics of multiple portfolios — upside capture, downside capture, and overall relative returns — over the past five years ending January 31, 2025.

Figure 1: Empirical Data. EAFE-Benchmarked Portfolios, five-year performance characteristics to January 31, 2025.
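To make the performance characteristics concrete, here is a minimal sketch of how upside and downside capture could be computed from paired periodic returns. It assumes a simple arithmetic-mean definition (average portfolio return divided by average benchmark return, taken separately over the benchmark's up and down periods); other definitions, such as ones based on compounded returns, are also in common use. The function name `capture_ratios` and the return series are hypothetical and are not taken from the figure.

```python
from statistics import mean


def capture_ratios(portfolio, benchmark):
    """Return (upside_capture, downside_capture) from paired periodic returns.

    Assumption: capture is the ratio of average portfolio return to average
    benchmark return, measured over periods when the benchmark rose (upside)
    or fell (downside).
    """
    pairs = list(zip(portfolio, benchmark))
    up = [(p, b) for p, b in pairs if b > 0]
    down = [(p, b) for p, b in pairs if b < 0]
    if not up or not down:
        raise ValueError("need at least one up and one down benchmark period")
    upside = mean(p for p, _ in up) / mean(b for _, b in up)
    downside = mean(p for p, _ in down) / mean(b for _, b in down)
    return upside, downside


# Hypothetical monthly returns, for illustration only.
benchmark = [0.02, -0.01, 0.03, -0.02, 0.01, -0.03]
portfolio = [0.03, -0.005, 0.03, -0.01, 0.02, -0.03]

upside, downside = capture_ratios(portfolio, benchmark)
print(f"upside capture: {upside:.2f}, downside capture: {downside:.2f}")
```

In this made-up example the portfolio gains more than the benchmark in up months and loses less in down months, so upside capture is above 1 and downside capture is below 1.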

This empirical dataset represents just a small sample of possible portfolios, and an even smaller sample of potential outcomes had events unfolded differently. Traditional approaches to expanding this dataset have significant limitations.

Figure 2: Instance-based approaches: K-nearest neighbors (left), SMOTE (right).

Traditional Synthetic Data: Understanding the Limitations

Conventional methods of synthetic data generation attempt to address data limitations but often fall short of capturing the complex dynamics of financial markets. Using our EAFE portfolio example, we can examine how different approaches perform:

Instance-based methods like K-nearest neighbors (K-NN) and SMOTE (Synthetic Minority Over-sampling Technique) extend existing data patterns through local sampling but remain fundamentally constrained by observed data relationships. They cannot generate scenarios much beyond their training examples, limiting their utility for understanding potential future market conditions.
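The constraint described above follows directly from how these methods build new points. The sketch below shows the interpolation step at the heart of SMOTE-style sampling: pick an observed point, pick one of its nearest neighbors, and place a new point somewhere on the line between them. It is a simplified illustration under stated assumptions, not the full SMOTE algorithm (which targets minority-class oversampling) and not the method used to produce Figure 2. The function name `smote_like_samples` and the (upside capture, downside capture) pairs are hypothetical.

```python
import math
import random


def smote_like_samples(points, n_new, k=3, seed=0):
    """Create n_new points by interpolating between observed neighbours."""
    if len(points) < 2:
        raise ValueError("need at least two observed points")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(points)
        # The k nearest neighbours of the base point, excluding the point itself.
        neighbours = sorted(
            (p for p in points if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, other)))
    return synthetic


# Hypothetical (upside capture, downside capture) pairs, for illustration only.
observed = [(1.00, 0.90), (1.10, 0.95), (0.90, 1.05), (1.20, 0.80), (1.05, 1.00)]
new_points = smote_like_samples(observed, n_new=5, k=2, seed=42)
for point in new_points:
    print(tuple(round(v, 3) for v in point))
```

Because every new point lies on a segment between two observed points, the synthetic sample can never leave the region already spanned by the data, which is exactly the limitation noted above.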

Figure 3: More flexible approaches generally improve outcomes but struggle to capture complex market relationships: GMM (left), KDE (right).

Traditional synthetic data generation approaches, whether through instance-based methods or density estimation, face fundamental limitations. While these approaches can extend patterns incrementally, they cannot generate realistic market scenarios that preserve complex inter-relationships while exploring genuinely different market conditions. This limitation becomes particularly clear when we examine density estimation approaches.

Density estimation approaches like Gaussian mixture models (GMM) and kernel density estimation (KDE) offer more flexibility in extending data patterns, but still struggle to capture the complex, interconnected dynamics of financial markets. These methods particularly falter during regime changes, when historical relationships may evolve.
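As a rough illustration of the density estimation idea, the sketch below draws samples from a Gaussian kernel density estimate: choose an observed point at random, then add Gaussian noise scaled by a per-dimension bandwidth. The bandwidth rule (a fixed fraction of each dimension's sample standard deviation) is an assumption made for simplicity, and the function name `kde_samples` and the data are hypothetical. This is not the configuration behind Figure 3.

```python
import random
from statistics import stdev


def kde_samples(points, n_new, bandwidth_scale=0.5, seed=0):
    """Draw n_new samples from a Gaussian KDE fitted to the observed points."""
    rng = random.Random(seed)
    dimensions = list(zip(*points))
    # Assumption: one bandwidth per dimension, set to a fraction of that
    # dimension's sample standard deviation.
    bandwidths = [bandwidth_scale * stdev(values) for values in dimensions]
    samples = []
    for _ in range(n_new):
        centre = rng.choice(points)  # pick one kernel uniformly at random
        samples.append(tuple(rng.gauss(c, h) for c, h in zip(centre, bandwidths)))
    return samples


# Hypothetical (upside capture, downside capture) pairs, for illustration only.
observed = [(1.00, 0.90), (1.10, 0.95), (0.90, 1.05), (1.20, 0.80), (1.05, 1.00)]
for point in kde_samples(observed, n_new=5, seed=7):
    print(tuple(round(v, 3) for v in point))
```

Unlike interpolation, the added noise lets samples fall outside the observed range. In this diagonal-bandwidth sketch, however, the noise in each dimension is drawn independently, so the relationship between dimensions is only as good as the observed points it is centered on, and nothing in the estimate can anticipate a change of regime.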

GenAI Synthetic Data: More Powerful Training

Recent research at City St George’s and the University of Warwick, presented at the NYU ACM International Conference on AI in Finance (ICAIF), demonstrates how GenAI can potentially better approximate the underlying data generating function of markets. Through neural network architectures, this approach aims to learn conditional distributions while preserving persistent market relationships.

The Research and Policy Center (RPC) will soon publish a report that defines synthetic data and outlines generative AI approaches that can be used to create it. The report will highlight best methods for evaluating the quality of synthetic data and use references to existing academic literature to highlight potential use cases.

Figure 4: Illustration of GenAI synthetic data expanding the space of realistic possible outcomes while maintaining key relationships.

This approach to synthetic data generation can be expanded to offer several potential advantages:

Expanded Training Sets: Realistic augmentation of limited financial datasets

Scenario Exploration: Generation of plausible market conditions while maintaining persistent relationships

Tail Event Analysis: Creation of varied but realistic stress scenarios

As illustrated in Figure 4, GenAI synthetic data approaches aim to expand the space of possible portfolio performance characteristics while respecting fundamental market relationships and realistic bounds. This provides a richer training environment for machine learning models, potentially reducing their vulnerability to historical artifacts and improving their ability to generalize across market conditions.
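One simple way to check whether a synthetic sample respects relationships and bounds is to compare a summary statistic between the real and synthetic data and to count how many synthetic points fall inside a plausible range. The sketch below uses the Pearson correlation between two characteristics as the relationship measure. Both the choice of measure and the helper names (`pearson`, `relationship_gap`, `share_within_bounds`) are illustrative assumptions, not the evaluation method used in the research discussed here.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def relationship_gap(real, synthetic):
    """Absolute difference in correlation between real and synthetic samples."""
    real_x, real_y = zip(*real)
    syn_x, syn_y = zip(*synthetic)
    return abs(pearson(real_x, real_y) - pearson(syn_x, syn_y))


def share_within_bounds(synthetic, lower, upper):
    """Fraction of synthetic points inside the per-dimension bounds."""
    inside = [
        all(lo <= v <= hi for v, lo, hi in zip(point, lower, upper))
        for point in synthetic
    ]
    return sum(inside) / len(inside)


# Hypothetical (upside capture, downside capture) pairs, for illustration only.
real = [(1.00, 0.90), (1.10, 0.95), (0.90, 1.05), (1.20, 0.80), (1.05, 1.00)]
synthetic = [(1.02, 0.92), (1.15, 0.85), (0.95, 1.02), (1.30, 0.78), (0.85, 1.10)]

print("correlation gap:", round(relationship_gap(real, synthetic), 3))
print("share within bounds:", share_within_bounds(synthetic, (0.5, 0.5), (1.5, 1.5)))
```

A small correlation gap and a high share within bounds are necessary rather than sufficient signs of quality; a fuller evaluation would look at more than one pairwise statistic.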

Implementation in Security Selection

For equity selection models, which are particularly susceptible to learning spurious historical patterns, GenAI synthetic data offers three potential benefits:

Reduced Overfitting: By training on varied market conditions, models may better distinguish between persistent signals and temporary artifacts.

Enhanced Tail Risk Management: More diverse scenarios in training data could improve model robustness during market stress.

Better Generalization: Expanded training data that maintains realistic market relationships may help models adapt to changing conditions.

The implementation of effective GenAI synthetic data generation presents its own technical challenges, potentially exceeding the complexity of the investment models themselves. However, our research suggests that successfully addressing these challenges could significantly improve risk-adjusted returns through more robust model training.

The GenAI Path to Better Model Training

GenAI synthetic data has the potential to provide more powerful, forward-looking insights for investment and risk models. Through neural network-based architectures, it aims to better approximate the market’s data generating function, potentially enabling more accurate representation of future market conditions while preserving persistent inter-relationships.

While this could benefit most investment and risk models, it is an especially important innovation now because of the increasing adoption of machine learning in investment management and the related risk of overfitting. GenAI synthetic data can generate plausible market scenarios that preserve complex relationships while exploring different conditions. This technology offers a path to more robust investment models.

However, even the most advanced synthetic data cannot compensate for naïve machine learning implementations. There is no safe fix for excessive complexity, opaque models, or weak investment rationales.

The Research and Policy Center will host a webinar tomorrow, March 18, featuring Marcos López de Prado, a world-renowned expert in financial machine learning and quantitative research.

© Copyright 2024 All Rights Reserved
See articles for original source and related links to external sites.