TL;DR

  • 25+ data sources across seven categories: news, social sentiment, market data, options flow, government filings, crypto, and alternative data.
  • Free sources work out of the box with no API keys. These include RSS feeds, StockTwits, Yahoo Finance, SEC EDGAR, CoinGecko, and more.
  • API-key sources unlock deeper data — Reddit, Alpha Vantage, Finnhub, FRED, and others. Most offer free tiers generous enough for trading use.
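  • Basic tier includes a core set of free sources: RSS feeds, StockTwits, Yahoo Finance, and the economic calendar. Max tier unlocks every source, including the remaining free sources and those that require API keys or specialized parsing.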
  • More sources means more signal diversity. The ML ensemble weighs each source by its historical predictive accuracy, so low-quality sources are automatically down-weighted.

News and RSS Feeds

News is one of the fastest-moving inputs in any trading system. A single headline can move a stock 5% in seconds. slmaj polls multiple RSS feeds every 5 minutes, parses headlines for sentiment and relevance, and incorporates the results into signal generation within minutes of publication.

Reuters RSS
Provides breaking financial news, earnings reports, and macroeconomic headlines from one of the world's most established wire services. Reuters coverage spans US equities, global markets, currencies, and commodities. slmaj parses headline sentiment and cross-references mentioned tickers against the active watchlist.
Free. No API key required. Polled every 5 minutes. Tier: Basic and Max.
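
The watchlist cross-reference described above can be sketched as follows. This is a minimal illustration, not slmaj's actual parser: it assumes a standard RSS 2.0 feed, and the WATCHLIST mapping, the sample XML, and the function names are all hypothetical. Sentiment scoring is left out.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical watchlist: ticker -> company names that may appear in a headline.
WATCHLIST = {
    "NVDA": ["Nvidia"],
    "MSFT": ["Microsoft"],
}

# Made-up sample feed standing in for a fetched RSS 2.0 document.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>Nvidia beats earnings expectations</title></item>
<item><title>Oil prices slip on demand worries</title></item>
<item><title>Microsoft and NVDA extend cloud partnership</title></item>
</channel></rss>"""


def match_tickers(headline, watchlist):
    """Return the sorted watchlist tickers mentioned in a headline."""
    found = set()
    for ticker, names in watchlist.items():
        # Match the ticker itself or any company name as a whole word.
        for term in [ticker] + names:
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, headline, re.IGNORECASE):
                found.add(ticker)
    return sorted(found)


def parse_feed(xml_text, watchlist):
    """Yield (headline, tickers) for each item that mentions a watched symbol."""
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        tickers = match_tickers(title, watchlist)
        if tickers:
            yield title, tickers


matches = list(parse_feed(SAMPLE_RSS, WATCHLIST))
```

Case-insensitive whole-word matching is a simplification: tickers that are also common words (for example ALL or NOW) would need stricter rules, such as requiring a cashtag or an exchange prefix.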

CNBC RSS
Delivers market-moving headlines, analyst upgrades and downgrades, sector rotation commentary, and Fed policy coverage. CNBC's editorial focus on US equities makes it particularly useful for stock-specific signals. The bot extracts named entities (companies, executives, products) and maps them to tradable symbols.
Free. No API key required. Polled every 5 minutes. Tier: Basic and Max.

AP News RSS
Covers geopolitical events, natural disasters, regulatory actions, and other macro-level developments that affect broad market sentiment. AP's reporting tends to be factual and low-noise, which makes it a reliable input for the sentiment model. Events like sanctions, trade policy changes, and government shutdowns are flagged as high-impact.
Free. No API key required. Polled every 5 minutes. Tier: Basic and Max.

News API
Aggregates articles from over 80,000 news sources worldwide, with full-text search by keyword, ticker, and date range. slmaj uses News API to backfill context around signals detected by other sources — for example, if StockTwits sentiment for NVDA spikes, News API checks whether there is a corresponding news catalyst. This source requires an API key; the free tier provides 100 requests per day, which is sufficient for most configurations.
Requires API key. Updated on demand per signal. Tier: Max only.

Social Sentiment

Retail trader sentiment often moves ahead of — or amplifies — institutional moves. Social platforms capture this sentiment in real time. slmaj processes social data through natural language processing to extract bullish/bearish signals, volume anomalies, and momentum shifts.

StockTwits
The largest social network dedicated to stock market discussion. StockTwits provides a real-time stream of user posts tagged with specific tickers, along with a crowd-sourced bullish/bearish sentiment label for each post. slmaj tracks the rolling sentiment ratio (bullish vs bearish posts) and message volume for each watched symbol. A sudden spike in volume or a sharp sentiment shift triggers an alert to the signal engine.
Free. No API key required. Polled every 10 minutes. Tier: Basic and Max.
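
One way to picture the rolling sentiment ratio is a fixed-size window over the most recent labeled posts. The sketch below is an assumption about how such a window could work; the class name, the window size, and the 0.25 shift threshold are illustrative and are not taken from slmaj.

```python
from collections import deque


class SentimentWindow:
    """Rolling bullish ratio over the last `size` labeled posts."""

    def __init__(self, size=100):
        self.labels = deque(maxlen=size)

    def add(self, label):
        # Only posts with an explicit label count toward the ratio.
        if label in ("bullish", "bearish"):
            self.labels.append(label)

    def bullish_ratio(self):
        """Fraction of labeled posts that are bullish, or None if empty."""
        if not self.labels:
            return None
        bullish = sum(1 for label in self.labels if label == "bullish")
        return bullish / len(self.labels)


def sentiment_shift(previous, current, threshold=0.25):
    """True when the ratio moved by at least `threshold` between two readings."""
    return abs(current - previous) >= threshold


window = SentimentWindow(size=4)
for label in ["bullish", "bearish", None, "bullish", "bullish"]:
    window.add(label)
ratio = window.bullish_ratio()  # 3 bullish out of 4 labeled posts
```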

Reddit
Monitors the subreddits where retail traders concentrate: r/wallstreetbets, r/stocks, and r/investing. slmaj tracks post frequency, upvote velocity, comment sentiment, and ticker mention counts. Unusual activity — such as a ticker going from 10 mentions per day to 500 — is flagged as a potential momentum event. The Reddit API requires OAuth credentials; the free tier allows 100 requests per minute, which is more than sufficient.
Requires API key. Polled every 15 minutes. Tier: Max only.
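
A mention spike like the 10-to-500 jump above can be detected by comparing today's count against a trailing baseline. The sketch below uses the median of recent daily counts as that baseline; the function name and both thresholds are hypothetical choices for illustration.

```python
from statistics import median


def is_mention_spike(history, today, min_ratio=10.0, min_mentions=50):
    """Flag `today` when it is far above the trailing median of `history`.

    `min_ratio` and `min_mentions` are illustrative thresholds.
    """
    if not history:
        return False
    baseline = max(median(history), 1)  # guard against a zero baseline
    return today >= min_mentions and today / baseline >= min_ratio


# Made-up daily mention counts for one ticker over the past week.
daily_mentions = [8, 12, 10, 9, 11, 10, 13]
spike = is_mention_spike(daily_mentions, today=500)
```

The median is used instead of the mean so that a single earlier outlier does not raise the baseline and mask a new spike.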

Twitter/X
Tracks financial commentary from verified analysts, institutional accounts, and high-follower-count traders. slmaj monitors keyword streams for ticker mentions, earnings reactions, and breaking news commentary. Twitter's real-time nature makes it one of the fastest sentiment indicators, though the signal-to-noise ratio is lower than dedicated financial platforms. The API key requirement limits this to the Max tier.
Requires API key. Streaming (near real-time). Tier: Max only.

Market Data

Market data forms the quantitative backbone of every signal. Price history, volume, fundamentals, and analyst ratings all feed into the ML models. slmaj uses multiple market data providers to ensure redundancy and cross-validation.

Yahoo Finance
The primary market data source. Yahoo Finance provides historical price data (OHLCV), real-time quotes (with a short delay for free users), company fundamentals (P/E ratio, revenue, EPS, market cap), analyst consensus ratings, and dividend data. slmaj uses Yahoo Finance for all technical indicator calculations (moving averages, RSI, MACD, Bollinger Bands) and for fundamental screening. No API key is required; data is accessed through the unofficial but well-maintained yfinance Python library.
Free. No API key required. Price data polled every 1 minute; fundamentals updated daily. Tier: Basic and Max.
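
For readers unfamiliar with the indicators named above, here is a small pure-Python sketch of three of them computed from a list of closing prices. It is illustrative only: the function names are hypothetical, the RSI uses simple averages (Wilder's original formulation uses smoothed averages), and nothing here reflects how slmaj or yfinance computes these values internally.

```python
from statistics import mean, pstdev


def sma(closes, window):
    """Simple moving average of the last `window` closes."""
    if len(closes) < window:
        raise ValueError("not enough data")
    return mean(closes[-window:])


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (lower, middle, upper) bands for the latest bar."""
    middle = sma(closes, window)
    spread = num_std * pstdev(closes[-window:])
    return middle - spread, middle, middle + spread


def rsi(closes, period=14):
    """RSI from simple averages of gains and losses over `period` changes."""
    if len(closes) < period + 1:
        raise ValueError("not enough data")
    recent = closes[-(period + 1):]
    changes = [b - a for a, b in zip(recent, recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


# Made-up closes; short windows keep the arithmetic easy to follow.
closes = [10, 11, 12, 11, 12]
lower, middle, upper = bollinger_bands(closes, window=4)
```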

Alpha Vantage
Provides premium fundamentals data (income statements, balance sheets, cash flow statements), forex rates for over 100 currency pairs, and technical indicator calculations via API. slmaj uses Alpha Vantage primarily for forex data and as a fallback for fundamental data when Yahoo Finance is unavailable. The free tier offers 25 API calls per day; the premium tier removes this limit.
Requires API key. Updated daily for fundamentals; real-time for forex. Tier: Max only.

Finnhub
Delivers real-time quote data, company news, earnings calendars, and insider transaction filings. Finnhub's websocket API provides sub-second quote updates for US equities, which slmaj uses for timing entries and exits on high-confidence signals. The company news endpoint surfaces articles that may not appear in RSS feeds. The free tier allows 60 API calls per minute.
Requires API key. Real-time via websocket. Tier: Max only.

Polygon.io
Provides tick-level and minute-level market data for US stocks, options, forex, and crypto. Polygon is the data provider behind many institutional trading platforms. slmaj uses Polygon for high-resolution intraday data when available, which improves the accuracy of short-term technical signals. The free tier offers 5 API calls per minute with delayed data; paid tiers provide real-time access.
Requires API key. Tick-level resolution. Tier: Max only.

Options and Flow Data

Options markets often lead equity markets. Large options trades — particularly unusual activity in short-dated contracts — can signal informed positioning ahead of catalysts. slmaj's options flow analysis detects these patterns and incorporates them into the signal model.

Options flow analysis
Monitors unusual options activity across the entire US options market. The system tracks put/call ratios (both volume and open interest), identifies large block trades, detects sweeps (aggressive orders that lift multiple ask levels), and flags unusual volume in specific strikes and expirations. For example, if a stock normally trades 1,000 call contracts per day and suddenly sees 15,000 in the first hour of trading — concentrated in near-term out-of-the-money strikes — that is flagged as a bullish flow signal. The system also monitors the VIX options chain and SKEW index for macro-level risk sentiment.
Free. No API key required. Updated every 15 minutes during market hours. Tier: Max only.
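
The two simplest checks in this section, the put/call ratio and the unusual-volume flag, can be sketched as below. The function names and the 5x multiple are assumptions made for illustration; block-trade and sweep detection need order-level data and are not shown.

```python
def put_call_ratio(put_volume, call_volume):
    """Put volume divided by call volume; None when no calls traded."""
    if call_volume == 0:
        return None
    return put_volume / call_volume


def unusual_call_volume(volume_so_far, avg_daily_volume, multiple=5.0):
    """True when volume so far today is at least `multiple` times the daily average."""
    if avg_daily_volume <= 0:
        return False
    return volume_so_far >= multiple * avg_daily_volume


# The example from the text: 15,000 contracts in the first hour vs. 1,000 on a normal day.
flagged = unusual_call_volume(15_000, 1_000)
```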

Government and Economic

Government filings and economic data provide fundamental context that purely technical models miss. Regulatory filings reveal what insiders and institutions are doing with their own capital. Economic indicators signal regime changes — from growth to recession, from low rates to high rates — that affect every asset class.

SEC EDGAR filings
The Securities and Exchange Commission's EDGAR database contains every public filing made by US-listed companies. slmaj monitors 10-K (annual reports), 10-Q (quarterly reports), 8-K (material events), Form 4 (insider transactions), and 13F (institutional holdings) filings. Insider buying is widely regarded as a reliable bullish indicator, particularly when multiple insiders buy within a short period. The EDGAR full-text search API is free and rate-limited to 10 requests per second.
Free. No API key required. Polled every 30 minutes. Tier: Max only.
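
The "multiple insiders buying" pattern can be illustrated with a small cluster check over Form 4 transactions. This is a sketch under stated assumptions: the tuple layout, the function name, and the thresholds (three insiders within seven days) are hypothetical. The only detail taken from the Form 4 format itself is the transaction code, where "P" marks an open-market purchase and "S" a sale.

```python
from datetime import date, timedelta


def has_cluster_buy(transactions, min_insiders=3, window_days=7):
    """True if at least `min_insiders` distinct insiders bought within `window_days`.

    `transactions` is a list of (date, insider_name, code) tuples.
    """
    buys = sorted((d, name) for d, name, code in transactions if code == "P")
    window = timedelta(days=window_days)
    for i, (start, _) in enumerate(buys):
        # Count distinct buyers from this purchase up to the end of the window.
        insiders = {name for d, name in buys[i:] if d - start <= window}
        if len(insiders) >= min_insiders:
            return True
    return False


# Made-up filings for a single company.
filings = [
    (date(2024, 3, 1), "Insider A", "P"),
    (date(2024, 3, 3), "Insider B", "P"),
    (date(2024, 3, 4), "Insider D", "S"),
    (date(2024, 3, 5), "Insider C", "P"),
]
clustered = has_cluster_buy(filings)
```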

Congressional trades
Under the STOCK Act of 2012, members of the US Congress must disclose stock trades within 45 days. slmaj monitors these disclosures, which are published as periodic transaction reports. Several academic studies have reported that congressional trades outperformed the market over the periods they examined. The system tracks which members are buying or selling, the dollar amounts, and whether the trades cluster around upcoming legislation or committee hearings. Data is sourced from public disclosure databases.
Free. No API key required. Updated daily (disclosures published in batches). Tier: Max only.

FRED (Federal Reserve Economic Data)
The Federal Reserve Bank of St. Louis maintains FRED, the most comprehensive public database of US economic indicators. slmaj pulls interest rates (federal funds rate, 10-year Treasury yield), inflation metrics (CPI, PCE), employment data (unemployment rate, non-farm payrolls), GDP growth, money supply, and credit spreads. These indicators drive regime detection in the ML model — the strategy behaves differently in rising-rate vs falling-rate environments, for example. FRED requires a free API key (registration takes two minutes).
Requires API key. Updated on release schedule (monthly or quarterly depending on indicator). Tier: Max only.

Treasury yields
Provides daily yield curve data for US Treasury securities across all maturities (1-month through 30-year). slmaj uses yield curve shape (normal, flat, inverted) as a macroeconomic regime signal. An inverted yield curve — where short-term rates exceed long-term rates — has preceded every US recession since 1970 and triggers a defensive posture in the signal model (reduced position sizes, preference for defensive sectors). Data is sourced from the US Treasury Department's daily yield curve rates.
Free. No API key required. Updated daily. Tier: Max only.
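
The shape classification can be reduced to the spread between one short and one long maturity. The sketch below assumes the pair is something like the 2-year and 10-year yields, which is a common convention but is not stated in this document; the function name and the 0.25-point "flat" band are likewise illustrative.

```python
def classify_yield_curve(short_yield, long_yield, flat_band=0.25):
    """Classify curve shape from two yields given in percentage points.

    Inverted means the short-term yield exceeds the long-term yield.
    `flat_band` is an illustrative tolerance for calling the curve flat.
    """
    spread = long_yield - short_yield
    if spread < 0:
        return "inverted"
    if spread <= flat_band:
        return "flat"
    return "normal"


# Made-up yields: short-term 5.0%, long-term 4.2%.
shape = classify_yield_curve(5.0, 4.2)
```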

Economic calendar
Tracks the schedule of market-moving economic releases: FOMC rate decisions, non-farm payrolls (NFP), Consumer Price Index (CPI), Producer Price Index (PPI), retail sales, and earnings dates for watched companies. slmaj uses the calendar to reduce position sizes ahead of high-impact events (when volatility is expected to spike) and to avoid opening new positions in the 30 minutes before a major release. This proactive risk reduction lowers the chance of being caught on the wrong side of a data surprise.
Free. No API key required. Updated weekly. Tier: Basic and Max.
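
The 30-minute rule above amounts to a time-window check against the calendar. A minimal sketch, with a hypothetical function name and a made-up release time:

```python
from datetime import datetime, timedelta, timezone

# From the text: no new positions in the 30 minutes before a major release.
BLACKOUT = timedelta(minutes=30)


def in_blackout(now, release_times, window=BLACKOUT):
    """True if `now` falls within `window` before any scheduled release."""
    return any(
        timedelta(0) <= release - now <= window
        for release in release_times
    )


# Made-up release time, expressed in UTC to avoid timezone ambiguity.
release = datetime(2024, 1, 11, 13, 30, tzinfo=timezone.utc)
blocked = in_blackout(release - timedelta(minutes=10), [release])
```

Using timezone-aware datetimes throughout keeps the comparison valid when calendar entries and the bot's clock come from different time zones.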

Crypto and DeFi

Cryptocurrency markets operate 24/7 and are driven by a different set of fundamentals than equities. On-chain data, DeFi metrics, and prediction markets provide signals that have no equivalent in traditional finance.

CoinGecko
The most comprehensive free crypto data aggregator. CoinGecko provides real-time prices, market capitalization, 24-hour volume, circulating supply, and historical data for over 10,000 cryptocurrencies. slmaj uses CoinGecko for crypto price feeds, market dominance ratios (BTC dominance as a risk-on/risk-off indicator), and volume anomaly detection. The free tier allows 30 calls per minute, which is sufficient for the supported crypto pairs.
Free. No API key required. Polled every 5 minutes. Tier: Max only.

DeFi Llama
Tracks Total Value Locked (TVL) across decentralized finance protocols on all major blockchains. TVL is a fundamental metric for DeFi — analogous to assets under management in traditional finance. slmaj monitors TVL trends to gauge capital flows into and out of the DeFi ecosystem, which correlates with broader crypto market sentiment. A rising TVL during falling prices suggests accumulation; falling TVL during rising prices suggests distribution. All DeFi Llama data is free and open.
Free. No API key required. Updated every 30 minutes. Tier: Max only.
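
The accumulation and distribution readings described above are divergences between the direction of TVL and the direction of price. A rough sketch, assuming start and end values over some lookback period; the function names and the 5% minimum move are hypothetical:

```python
def pct_change(start, end):
    """Fractional change from `start` to `end`."""
    return (end - start) / start


def tvl_price_divergence(tvl_start, tvl_end, price_start, price_end, min_move=0.05):
    """Label a TVL/price divergence, ignoring moves smaller than `min_move`."""
    tvl_chg = pct_change(tvl_start, tvl_end)
    price_chg = pct_change(price_start, price_end)
    if tvl_chg >= min_move and price_chg <= -min_move:
        return "accumulation"  # TVL rising while prices fall
    if tvl_chg <= -min_move and price_chg >= min_move:
        return "distribution"  # TVL falling while prices rise
    return "none"


# Made-up values: TVL up 10%, price down 10%.
label = tvl_price_divergence(100.0, 110.0, 50.0, 45.0)
```

One caveat worth keeping in mind: TVL quoted in dollars moves with token prices, so a production check would likely need to adjust for that before comparing the two series.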

Polymarket
A prediction market platform where users trade contracts on real-world events such as election outcomes, regulatory decisions, interest rate changes, and macroeconomic scenarios. slmaj ingests Polymarket odds for financially relevant events (e.g., the probability of a rate cut at the next FOMC meeting) and uses them as forward-looking sentiment indicators. Prediction markets have often been more accurate than polls and expert forecasts, which makes them a useful input for macro regime detection.
Free. No API key required. Polled every 30 minutes. Tier: Max only.

Alternative Data

Alternative data sources capture signals that traditional market data misses. Developer activity, public attention, and search trends can reveal shifts in company momentum before they show up in earnings or price action.

GitHub activity
Tracks repository commit frequency, contributor counts, and release cadence for publicly traded technology companies. A surge in development activity — more commits, more contributors, more frequent releases — can signal an upcoming product launch or major feature release. Conversely, declining activity may indicate internal problems or strategic pivots. slmaj monitors the public repositories of companies like MSFT (VS Code, TypeScript), GOOGL (TensorFlow, Kubernetes), and META (React, PyTorch). GitHub's public API is free with rate limits of 60 unauthenticated requests per hour.
Free. No API key required. Updated daily. Tier: Max only.

Wikipedia trends
Monitors page view counts for Wikipedia articles about publicly traded companies, their executives, and their products. Academic research has shown that spikes in Wikipedia page views correlate with increased trading volume and, in some cases, predict short-term price movements. When a company's Wikipedia page goes from 500 views per day to 50,000 — often due to a news event, controversy, or viral moment — that attention spike is flagged as a potential catalyst. The Wikimedia REST API is free and open.
Free. No API key required. Updated daily (with 24-hour lag). Tier: Max only.

Google Trends
Measures search interest for specific terms over time, normalized on a 0–100 scale. slmaj tracks Google Trends data for ticker symbols, company names, and product terms. A sudden increase in search volume for a ticker often precedes a large price move — earnings surprises, FDA approvals, acquisition rumors, and scandals all generate search spikes before the full impact is priced in. The Google Trends API requires authentication but is free to use.
Requires API key. Updated daily. Tier: Max only.

Summary Comparison Table

The table below lists each data source described above, whether it is free, whether it requires an API key, which tier includes it, and how frequently slmaj fetches updates.

Source                | Free?     | API Key? | Tier          | Update Frequency
----------------------+-----------+----------+---------------+---------------------
Reuters RSS           | Yes       | No       | Basic and Max | Every 5 min
CNBC RSS              | Yes       | No       | Basic and Max | Every 5 min
AP News RSS           | Yes       | No       | Basic and Max | Every 5 min
News API              | Free tier | Yes      | Max           | On demand
StockTwits            | Yes       | No       | Basic and Max | Every 10 min
Reddit                | Free tier | Yes      | Max           | Every 15 min
Twitter/X             | No        | Yes      | Max           | Near real-time
Yahoo Finance         | Yes       | No       | Basic and Max | Every 1 min
Alpha Vantage         | Free tier | Yes      | Max           | Daily / real-time
Finnhub               | Free tier | Yes      | Max           | Real-time
Polygon.io            | Free tier | Yes      | Max           | Tick-level
Options flow analysis | Yes       | No       | Max           | Every 15 min
SEC EDGAR filings     | Yes       | No       | Max           | Every 30 min
Congressional trades  | Yes       | No       | Max           | Daily
FRED                  | Free tier | Yes      | Max           | Per release schedule
Treasury yields       | Yes       | No       | Max           | Daily
Economic calendar     | Yes       | No       | Basic and Max | Weekly
CoinGecko             | Yes       | No       | Max           | Every 5 min
DeFi Llama            | Yes       | No       | Max           | Every 30 min
Polymarket            | Yes       | No       | Max           | Every 30 min
GitHub activity       | Yes       | No       | Max           | Daily
Wikipedia trends      | Yes       | No       | Max           | Daily
Google Trends         | Yes       | Yes      | Max           | Daily

Frequently Asked Questions

Do I need to configure all 25+ data sources to start trading?

No. The Basic tier works out of the box with zero configuration. It uses free sources — RSS feeds, StockTwits, Yahoo Finance, and the economic calendar — that require no API keys. These sources alone provide enough signal diversity for the ML ensemble to generate trades. Adding more sources through the Max tier improves signal quality, but it is not a prerequisite for getting started.

What happens if a data source goes down or returns bad data?

slmaj monitors the health of every data source through its data quality engine. If a source fails to respond, returns stale data, or produces values outside expected ranges, it is automatically excluded from signal generation until it recovers. The ML ensemble adjusts its weights in real time, so losing one source does not disable the system — it simply operates with the remaining healthy sources. All data quality events are logged for review.
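
The three failure conditions named above (no response, stale data, out-of-range values) map naturally onto a per-reading health check. The sketch below is an assumption about how such a filter could look; the Reading fields, the function names, and the default limits are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    source: str
    value: Optional[float]  # None means the source failed to respond
    age_seconds: float      # time elapsed since the value was fetched


def is_healthy(reading, max_age_seconds, low, high):
    """A reading is healthy if it exists, is fresh, and is within range."""
    if reading.value is None:
        return False
    if reading.age_seconds > max_age_seconds:
        return False
    return low <= reading.value <= high


def healthy_sources(readings, max_age_seconds=600, low=-1.0, high=1.0):
    """Names of sources whose latest reading passes every check."""
    return [
        r.source for r in readings
        if is_healthy(r, max_age_seconds, low, high)
    ]


# Made-up sentiment readings on a -1..1 scale.
readings = [
    Reading("reuters_rss", 0.2, 120),
    Reading("stocktwits", None, 30),      # no response
    Reading("reddit", 0.5, 4000),         # stale
    Reading("news_api", 7.3, 60),         # outside expected range
]
usable = healthy_sources(readings)
```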

Are the free API tiers sufficient, or do I need paid subscriptions?

For most users, the free tiers are sufficient. slmaj is designed to stay within free-tier rate limits by caching responses, batching requests, and staggering polling intervals, so it avoids redundant API calls. If you trade a large number of symbols or need tick-level data from Polygon.io, a paid subscription may be worthwhile, but it is not required for normal operation.
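
Response caching is the simplest of these techniques to illustrate. The sketch below shows a small time-to-live cache that only calls the upstream API when the cached value has expired. The class name, the method name, and the injectable clock are hypothetical and are not a description of slmaj's internals.

```python
import time


class TTLCache:
    """Cache values per key and refetch only after `ttl_seconds` have passed."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the cache can be tested without sleeping
        self._store = {}    # key -> (fetched_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for `key`, calling `fetch()` only on a miss."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = fetch()
        self._store[key] = (now, value)
        return value
```

With a 60-second TTL, a symbol requested by several strategies within the same minute costs one API call instead of several, which matters most for sources with small daily quotas.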

How does the ML ensemble weigh different data sources?

Each data source is assigned a dynamic weight based on its historical predictive accuracy for the specific asset class being traded. Sources that have been accurate in the recent past receive higher weights; sources that have been noisy or uncorrelated with outcomes receive lower weights. This weighting is recalculated continuously. The result is that even if you enable a low-quality source, it will not degrade signal quality — the ensemble will learn to ignore it. See the features page for more on the ML ensemble architecture.
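
As a rough illustration of accuracy-based weighting, the sketch below gives each source a weight proportional to how far its recent hit rate exceeds chance, so a source at or below 50% gets a weight of zero. This is one simple scheme chosen for clarity, not the ensemble's actual method; the function names, the 0.5 floor, and the sample hit rates are assumptions.

```python
def accuracy_weights(hit_rates, floor=0.5):
    """Weights proportional to each source's edge over `floor` (chance level)."""
    edges = {s: max(rate - floor, 0.0) for s, rate in hit_rates.items()}
    total = sum(edges.values())
    if total == 0:
        # No source beats chance: fall back to equal weights.
        return {s: 1.0 / len(edges) for s in edges}
    return {s: edge / total for s, edge in edges.items()}


def combine(signals, weights):
    """Weighted sum of per-source signals (for example, scores in -1..1)."""
    return sum(weights.get(s, 0.0) * v for s, v in signals.items())


# Made-up recent hit rates per source.
hit_rates = {"news": 0.60, "stocktwits": 0.55, "noisy": 0.48}
weights = accuracy_weights(hit_rates)
score = combine({"news": 1.0, "stocktwits": 1.0, "noisy": -1.0}, weights)
```

In this example the "noisy" source disagrees with the other two, but because its weight is zero the combined score is unaffected.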