Sunday, September 21, 2025
TheAdviserMagazine.com

How GenAI-Powered Synthetic Data Is Reshaping Investment Workflows

by TheAdviserMagazine
2 months ago
in Investing


In today’s data-driven investment environment, the quality, availability, and specificity of data can make or break a strategy. Yet investment professionals routinely face limitations: historical datasets may not capture emerging risks, alternative data is often incomplete or prohibitively expensive, and open-source models and datasets are skewed toward major markets and English-language content.

As firms seek more adaptable and forward-looking tools, synthetic data, particularly when derived from generative AI (GenAI), is emerging as a strategic asset, offering new ways to simulate market scenarios, train machine learning models, and backtest investment strategies. This post explores how GenAI-powered synthetic data is reshaping investment workflows, from simulating asset correlations to enhancing sentiment models, and what practitioners need to know to evaluate its utility and limitations.

What exactly is synthetic data, how is it generated by GenAI models, and why is it increasingly relevant for investment use cases?

Consider two common challenges. A portfolio manager looking to optimize performance across varying market regimes is constrained by historical data, which can’t account for “what-if” scenarios that have yet to occur. Similarly, a data scientist monitoring sentiment in German-language news for small-cap stocks may find that most available datasets are in English and focused on large-cap companies, limiting both coverage and relevance. In both cases, synthetic data offers a practical solution.

What Sets GenAI Synthetic Data Apart—and Why It Matters Now

Synthetic data refers to artificially generated datasets that replicate the statistical properties of real-world data. While the concept is not new — techniques like Monte Carlo simulation and bootstrapping have long supported financial analysis — what’s changed is the how.
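To make the two traditional techniques concrete, here is a minimal sketch contrasting a Monte Carlo draw, which assumes a distribution up front, with a bootstrap, which only resamples what was observed. The return history, the normal assumption, and the function names are illustrative assumptions, not data or code from the article.

```python
import random
import statistics


def monte_carlo_returns(mean, stdev, n, rng):
    """Draw n returns from an assumed normal distribution (parametric)."""
    return [rng.gauss(mean, stdev) for _ in range(n)]


def bootstrap_returns(history, n, rng):
    """Resample n returns with replacement from the observed history."""
    return [rng.choice(history) for _ in range(n)]


rng = random.Random(42)

# Hypothetical daily returns with one large loss (illustrative only).
history = [0.004, -0.002, 0.001, 0.003, -0.051,
           0.002, -0.001, 0.005, 0.000, -0.003]

mu = statistics.mean(history)
sigma = statistics.stdev(history)

mc_sample = monte_carlo_returns(mu, sigma, 1000, rng)
bs_sample = bootstrap_returns(history, 1000, rng)

# Monte Carlo can produce values never seen in the history, but only in the
# shape the assumed distribution allows; the bootstrap never leaves the
# observed values.
print("Monte Carlo min/max:", min(mc_sample), max(mc_sample))
print("Bootstrap min/max:  ", min(bs_sample), max(bs_sample))
```

Both approaches are limited by either the assumed distribution or the observed sample, which is the gap the GenAI methods described next aim to narrow.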

GenAI refers to a class of deep-learning models capable of generating high-fidelity synthetic data across modalities such as text, tabular, image, and time-series. Unlike traditional methods, GenAI models learn complex real-world distributions directly from data, eliminating the need for rigid assumptions about the underlying generative process. This capability opens up powerful use cases in investment management, especially in areas where real data is scarce, complex, incomplete, or constrained by cost, language, or regulation.

Common GenAI Models

There are several types of GenAI models. The most common are variational autoencoders (VAEs), generative adversarial networks (GANs), diffusion-based models, and large language models (LLMs). All are built on neural network architectures, though they differ in size and complexity. These methods have already demonstrated potential to enhance certain data-centric workflows within the industry. For example, VAEs have been used to create synthetic volatility surfaces to improve options trading (Bergeron et al., 2021). GANs have been applied to portfolio optimization and risk management (Zhu, Mariani and Li, 2020; Cont et al., 2023). Diffusion-based models have been used to simulate asset return correlation matrices under various market regimes (Kubiak et al., 2024). And LLMs have been used for market simulations (Li et al., 2024).

Table 1. Approaches to synthetic data generation.

Method | Types of data it generates | Example applications | Generative?
--- | --- | --- | ---
Monte Carlo | Time-series | Portfolio optimization, risk management | No
Copula-based functions | Time-series, tabular | Credit risk analysis, asset correlation modeling | No
Autoregressive models | Time-series | Volatility forecasting, asset return simulation | No
Bootstrapping | Time-series, tabular, textual | Creating confidence intervals, stress-testing | No
Variational Autoencoders | Tabular, time-series, audio, images | Simulating volatility surfaces | Yes
Generative Adversarial Networks | Tabular, time-series, audio, images | Portfolio optimization, risk management, model training | Yes
Diffusion models | Tabular, time-series, audio, images | Correlation modeling, portfolio optimization | Yes
Large language models | Text, tabular, images, audio | Sentiment analysis, market simulation | Yes

Evaluating Synthetic Data Quality

Synthetic data should be realistic and match the statistical properties of your real data. Existing evaluation methods fall into two categories: quantitative and qualitative.

Qualitative approaches involve visually comparing real and synthetic datasets. Examples include plotting distributions, scatterplots between pairs of variables, time-series paths, and correlation matrices side by side. For example, a GAN model trained to simulate asset returns for estimating value-at-risk should reproduce the heavy tails of the return distribution. A diffusion model trained to produce synthetic correlation matrices under different market regimes should adequately capture asset co-movements.

Quantitative approaches include statistical tests and divergence measures that compare distributions, such as the Kolmogorov-Smirnov test, the Population Stability Index, and Jensen-Shannon divergence. These output statistics indicating the similarity between two distributions. For example, the Kolmogorov-Smirnov test outputs a p-value which, if lower than 0.05, suggests the two distributions are significantly different. This provides a more concrete measure of similarity than visualizations do.
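The three measures named above can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions: the p-value uses the standard asymptotic approximation to the Kolmogorov distribution, the bin edges and the small floor `eps` in the PSI are arbitrary choices, and the "real" and "synthetic" samples are simulated stand-ins rather than data from the article.

```python
import bisect
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value approximation for the two-sample KS statistic."""
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:  # the series converges slowly here; p is effectively 1
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def histogram_probs(values, edges):
    """Share of values per bin; out-of-range values go to the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = bisect.bisect_right(edges, v) - 1
        counts[min(max(idx, 0), len(counts) - 1)] += 1
    return [c / len(values) for c in counts]


def psi(p, q, eps=1e-6):
    """Population Stability Index over binned shares (eps avoids log(0))."""
    return sum((max(pi, eps) - max(qi, eps))
               * math.log(max(pi, eps) / max(qi, eps))
               for pi, qi in zip(p, q))


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(x, y):
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


rng = random.Random(7)
real = [rng.gauss(0.0, 1.0) for _ in range(500)]   # stand-in "real" data
good = [rng.gauss(0.0, 1.0) for _ in range(500)]   # well-matched synthetic
bad = [rng.gauss(0.5, 2.0) for _ in range(500)]    # poorly matched synthetic
edges = [-4 + 0.8 * k for k in range(11)]          # 10 equal-width bins

for name, synth in (("good", good), ("bad", bad)):
    d = ks_statistic(real, synth)
    p_val = ks_pvalue(d, len(real), len(synth))
    pr, ps = histogram_probs(real, edges), histogram_probs(synth, edges)
    print(name, "KS D:", round(d, 3), "p:", round(p_val, 4),
          "PSI:", round(psi(pr, ps), 3), "JS:", round(js_divergence(pr, ps), 3))
```

In practice a library routine such as SciPy's two-sample KS test would normally replace the hand-written version; the stdlib form is used here only to keep the sketch self-contained.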

Another approach involves “train-on-synthetic, test-on-real,” where a model is trained on synthetic data and tested on real data. The performance of this model can be compared to a model that is trained and tested on real data. If the synthetic data successfully replicates the properties of real data, the performance between the two models should be similar.
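A toy version of "train-on-synthetic, test-on-real" might look like the sketch below. Everything in it is an assumption made for illustration: the one-feature, two-class data, the per-class Gaussian standing in for a GenAI generator, and the nearest-centroid classifier standing in for the downstream model.

```python
import random
import statistics


def make_real(n_per_class, rng):
    """Hypothetical 'real' data: one numeric feature, two classes."""
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(4.0, 1.0), 1) for _ in range(n_per_class)]
    rng.shuffle(data)
    return data


def make_synthetic(real_train, n_per_class, rng):
    """Stand-in generator: fit a Gaussian per class, then sample from it."""
    out = []
    for label in (0, 1):
        xs = [x for x, y in real_train if y == label]
        mu, sd = statistics.mean(xs), statistics.stdev(xs)
        out += [(rng.gauss(mu, sd), label) for _ in range(n_per_class)]
    return out


def fit_centroids(train):
    """Nearest-centroid 'model': the mean feature value of each class."""
    return {label: statistics.mean([x for x, y in train if y == label])
            for label in (0, 1)}


def accuracy(centroids, test):
    """Share of test rows whose nearest centroid matches the true label."""
    correct = 0
    for x, y in test:
        pred = min(centroids, key=lambda label: abs(x - centroids[label]))
        correct += pred == y
    return correct / len(test)


rng = random.Random(0)
real_train = make_real(200, rng)
real_test = make_real(100, rng)
synthetic_train = make_synthetic(real_train, 200, rng)

# Baseline: train on real, test on real.
acc_real = accuracy(fit_centroids(real_train), real_test)
# Train on synthetic, test on the same real hold-out.
acc_synth = accuracy(fit_centroids(synthetic_train), real_test)

print("train real / test real:     ", acc_real)
print("train synthetic / test real:", acc_synth)
```

A small gap between the two accuracies is the signal of interest; a large gap would suggest the generator has missed structure that matters for the task.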

In Action: Enhancing Financial Sentiment Analysis with GenAI Synthetic Data

To put this into practice, I fine-tuned a small open-source LLM, Qwen3-0.6B, for financial sentiment analysis using a public dataset of finance-related headlines and social media content, known as FiQA-SA[1]. The dataset consists of 822 training examples, with most sentences classified as “Positive” or “Negative” sentiment.

I then used GPT-4o to generate 800 synthetic training examples. The synthetic dataset generated by GPT-4o was more diverse than the original training data, covering a broader range of companies and sentiment (Figure 1). Increasing the diversity of the training data gives the LLM more examples from which to learn to identify sentiment in text, potentially improving model performance on unseen data.

Figure 1. Distribution of sentiment classes for the real (left), augmented (middle), and synthetic (right) training datasets. The augmented dataset consists of real and synthetic data.

Table 2. Example sentences from the real and synthetic training datasets.

Sentence | Class | Data
--- | --- | ---
Slump in Weir leads FTSE down from record high. | Negative | Real
AstraZeneca wins FDA approval for key new lung cancer pill. | Positive | Real
Shell and BG shareholders to vote on deal at end of January. | Neutral | Real
Tesla’s quarterly report shows an increase in vehicle deliveries by 15%. | Positive | Synthetic
PepsiCo is holding a press conference to address the recent product recall. | Neutral | Synthetic
Home Depot’s CEO steps down abruptly amidst internal controversies. | Negative | Synthetic

After fine-tuning a second model on a combination of real and synthetic data using the same training procedure, the F1-score increased by nearly 10 percentage points on the validation dataset (Table 3), with a final F1-score of 82.37% on the test dataset.

Table 3. Model performance on the FiQA-SA validation dataset.

Model | Weighted F1-Score
--- | ---
Model 1 (Real) | 75.29%
Model 2 (Real + Synthetic) | 85.17%
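For readers unfamiliar with the metric, a weighted F1-score is commonly computed as the per-class F1 averaged by each class's share of the true labels. The sketch below follows that usual definition; whether the case study computed it in exactly this way is an assumption, and the labels are made up for illustration.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    total = 0.0
    for label in sorted(support):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * support[label]
    return total / len(y_true)


# Illustrative labels only, not taken from FiQA-SA.
y_true = ["Positive", "Positive", "Positive", "Negative", "Negative", "Neutral"]
y_pred = ["Positive", "Positive", "Negative", "Negative", "Negative", "Positive"]

score = weighted_f1(y_true, y_pred)
print(f"Weighted F1: {score:.2%}")
```

Because the weights follow class frequency, a rare class such as "Neutral" contributes little to the headline number even when the model misses it entirely, which is worth keeping in mind when class balance differs between datasets.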

I found that increasing the proportion of synthetic data too much had a negative impact. There is a Goldilocks zone between too much and too little synthetic data for optimum results.
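One way to search for that zone is to build training sets at several synthetic shares and evaluate each on the same real validation set. The helper below is a hypothetical sketch: the function name, the candidate shares, and the placeholder rows are assumptions, with only the dataset sizes (822 real, 800 synthetic) taken from the text above.

```python
import random


def mix_datasets(real, synthetic, synthetic_share, rng):
    """Keep all real rows and add synthetic rows up to the target share.

    synthetic_share is the fraction of the mixed dataset that is synthetic.
    The number of synthetic rows is capped at what is available.
    """
    if not 0.0 <= synthetic_share < 1.0:
        raise ValueError("synthetic_share must be in [0, 1)")
    n_synth = round(len(real) * synthetic_share / (1.0 - synthetic_share))
    n_synth = min(n_synth, len(synthetic))
    mixed = list(real) + rng.sample(list(synthetic), n_synth)
    rng.shuffle(mixed)
    return mixed


rng = random.Random(1)

# Placeholder rows standing in for labeled sentences.
real = [("real", i) for i in range(822)]
synthetic = [("synthetic", i) for i in range(800)]

sizes = {}
for share in (0.0, 0.25, 0.5):
    mixed = mix_datasets(real, synthetic, share, rng)
    n_synth = sum(1 for source, _ in mixed if source == "synthetic")
    sizes[share] = (len(mixed), n_synth)
    print(f"target share {share:.2f}: {len(mixed)} rows, {n_synth} synthetic")
    # A fine-tuning and validation run would go here for each mix.
```

Holding the real data fixed and varying only the synthetic portion keeps the comparison between mixes clean, since any change in validation performance can be attributed to the added synthetic rows.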

Not a Silver Bullet, But a Valuable Tool

Synthetic data is not a replacement for real data, but it is worth experimenting with. Choose a method, evaluate synthetic data quality, and conduct A/B testing in a sandboxed environment where you compare workflows with and without different proportions of synthetic data. You might be surprised at the findings.

You can view all the code and datasets on the RPC Labs GitHub repository and take a deeper dive into the LLM case study in the Research and Policy Center’s “Synthetic Data in Investment Management” research report.

[1] The dataset is available for download here: https://huggingface.co/datasets/TheFinAI/fiqa-sentiment-classification



© Copyright 2024 All Rights Reserved
See articles for original source and related links to external sites.
