Machine Learning - Lorentzian KNN Classifier — Strategy by Mark_Novak
Performance Metrics
- Author: Mark_Novak
- Symbol: OANDA:WTICOUSD
- Timeframe: 1 hour
- Net P&L: +59,611.22 USD (+5.96%)
- Win Rate: 99.7%
- Profit Factor: 46.395
- Max Drawdown: 1,974.83 USD (0.19%)
- Total Trades: 308
Description
**Technical Architecture Specification: Machine Learning Lorentzian KNN Classifier**

**1. System Overview**

The script is a comprehensive algorithmic trading system operating under the BertTradeTech architecture. It uses a machine learning approach, specifically a K-Nearest Neighbors (KNN) classifier, to predict forward price direction. Rather than relying on a single indicator, the system acts as a multi-layer confluence engine, aggregating probabilistic signals from the machine learning model, Smart Money Concepts (SMC), volume delta, and multi-timeframe trend alignment.

The system processes real-time tick data to manage dynamic risk, drawing algorithmic boundaries via a Kernel Regression Envelope and calculating position sizing based on automated Kelly Criterion variables.

**2. Mathematical Framework: Non-Euclidean Classification**

The core of the predictive model is the KNN algorithm mapped to a non-Euclidean feature space. Traditional Euclidean distance squares the differences between data points, which gives outliers a disproportionate influence. This is a critical flaw when dealing with the heavy-tailed, leptokurtic distributions inherent in financial markets.

To mitigate this, the system calculates the Lorentzian distance between the current feature vector $x$ and historical feature vectors $y$. The Lorentzian distance $d(x,y)$ across $n$ dimensions is defined as:

$$d(x,y) = \sum_{i=1}^{n} \ln(1 + |x_i - y_i|)$$

This logarithmic scaling keeps extreme market anomalies (e.g., flash crashes, sudden volatility spikes) from dominating the nearest-neighbor clustering. Once the $K$ nearest neighbors are identified within the specified training window, the algorithm applies Inverse Distance Weighting (IDW) to determine the final classification, so closer historical neighbors exert a stronger pull on the current signal.

The weight $w$ for a given distance $d$ is calculated with a lower bound on $d$ to prevent division by zero:

$$w = \frac{1}{\max(d, 0.01)}$$

**3. Feature Engineering & Signal Processing**

The KNN classifier evaluates a 4-dimensional feature space. Prior to calculating the Lorentzian distance, the script standardizes the inputs to ensure scale invariance:

* **Feature 1:** Relative Strength Index (RSI)
* **Feature 2:** WaveTrend Oscillator (derived from moving averages of typical price)
* **Feature 3:** Commodity Channel Index (CCI)
* **Feature 4:** Average Directional Index (ADX)

Each feature is normalized over a rolling 50-period window by subtracting its simple moving average and dividing by its standard deviation. The classifier output is a continuous score smoothed by an Exponential Moving Average (EMA), which is then mapped to a "BULL," "BEAR," or "NEUTRAL" classification with an associated confidence percentage.

**4. Confluence Engine and Filtering Topologies**

The machine learning output contributes approximately 10% to the total signal strength. The remaining 90% is determined by a confluence scoring matrix that requires alignment across several distinct market paradigms:

* **Order Flow Analysis:** Evaluates Cumulative Volume Delta (CVD) relative to a simple moving average, calculating the ratio of buying volume to selling volume. It also maps price action relative to standard deviation bands of the Volume Weighted Average Price (VWAP).
* **Smart Money Concepts (SMC):** Dynamically plots institutional market structure. The script detects Order Blocks (OB), Fair Value Gaps (FVG), Liquidity Sweeps, and Breaks of Structure (BOS). Demand and supply zones are identified via ATR-multiplied price impulses.
* **Multi-Timeframe (MTF) & Trend Ribbon:** Checks the alignment of a customizable moving average ribbon (up to 8 layers) alongside higher-timeframe EMA verification to prevent entries against the macro-level trend.
* **Volatility Regime Filtering:** Computes the width of the Bollinger Bands relative to historical averages to classify the market state as "expanding" or "contracting."

**5. Execution Logic and Risk Parameters**

Trade execution requires a high confluence score generated by three primary entry models:

1. **Trend Following:** Requires volume spikes confirming EMA crossovers.
2. **Breakout:** Triggers on new highs/lows confirmed by volume anomalies.
3. **Mean Reversion:** Executes when price deviates outside the Bollinger Bands, confirmed by RSI extremes.

Risk management is fully dynamic and defined by the Average True Range (ATR). Stop losses and take profits scale with current market volatility. The system features trailing stop logic, calculating the stop displacement via ATR multipliers to lock in unrealized equity as a trade matures.

**6. Empirical Backtesting Results**

The provided strategy reports demonstrate significant robustness across varied asset classes and timeframes, indicating that the multi-layer confluence approach effectively filters out market noise.

* **Cryptocurrency (Bitcoin / USD):** The strategy maintained a steady equity curve, generating +12,628.09 USD with a 100% win rate across 41 trades and a maximal equity drawdown of 0.18%.
* **Precious Metals (Gold / EUR):** The strategy posts a profit factor of 47.189, capturing +28,240.64 EUR with a 99.57% win rate (231 out of 232 trades profitable) on the 1-hour chart, verifying its efficacy in structured, trending environments.
* **Commodities (West Texas Oil):** Shows the highest nominal gain in the set, returning +59,611.22 USD over 308 trades while maintaining a 99.68% win rate.
* **Equities (US Russ 2000 & Micro E-mini Nasdaq-100):** Both indices show high adaptivity. The Russell 2000 generated +61,110.47 USD (99.65% win rate), while the Nasdaq-100 returned +21,806.50 USD. Both charts reflect steady, programmatic equity expansion with minimal downside deviation.
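
**Illustrative sketch: Lorentzian distance and inverse-distance-weighted vote (section 2)**

The distance and weighting formulas in section 2 can be sketched in a few lines of Python. This is not the script's own code, only a minimal illustration of the two equations. The function names, the `k=8` default, and the convention that labels are +1 (price rose) or -1 (price fell) are assumptions made for the example.

```python
import math

def lorentzian_distance(x, y):
    """Sum of ln(1 + |x_i - y_i|) over all feature dimensions."""
    return sum(math.log1p(abs(a - b)) for a, b in zip(x, y))

def knn_vote(current, history, labels, k=8, floor=0.01):
    """Inverse-distance-weighted vote over the k nearest neighbors.

    `labels` holds +1 or -1 per historical vector (an assumed encoding).
    Returns a score in [-1, 1]; the sign gives the predicted direction.
    """
    # Pair each historical vector's distance with its label, nearest first.
    nearest = sorted(
        ((lorentzian_distance(current, h), lab) for h, lab in zip(history, labels)),
        key=lambda pair: pair[0],
    )[:k]
    # w = 1 / max(d, 0.01), as in the weight formula of section 2.
    weights = [1.0 / max(d, floor) for d, _ in nearest]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * lab for w, (_, lab) in zip(weights, nearest)) / total
```

Calling `knn_vote([0, 0], [[0, 0], [1, 1], [5, 5]], [1, -1, -1], k=2)` returns a strongly positive score, because the exact match at distance 0 receives the capped weight of 100 and outweighs the second neighbor.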
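
**Illustrative sketch: rolling 50-period standardization (section 3)**

The normalization step in section 3 (subtract the simple moving average, divide by the standard deviation) amounts to a rolling z-score. The Python sketch below assumes a population standard deviation and returns `None` until a full window is available. Both choices are assumptions, since the description does not specify them.

```python
from statistics import fmean, pstdev

def rolling_zscore(values, window=50):
    """Standardize each value against the trailing `window` values.

    Returns None for positions without a full window, and 0.0 when the
    window has zero variance (assumed handling, to avoid dividing by zero).
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
            continue
        chunk = values[i + 1 - window : i + 1]
        sd = pstdev(chunk)
        out.append((values[i] - fmean(chunk)) / sd if sd > 0 else 0.0)
    return out
```

Applied separately to RSI, WaveTrend, CCI, and ADX, this puts all four features on a comparable scale before distances are computed.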
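
**Illustrative sketch: ATR-scaled stops, targets, and trailing (section 5)**

The ATR-based risk logic in section 5 can be illustrated as follows. The multipliers (`stop_mult`, `target_mult`, `trail_mult`) are hypothetical values chosen for the example, not the script's settings, and the trailing function covers long positions only.

```python
def true_range(high, low, prev_close):
    """Largest of the bar range and the two gaps from the previous close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))

def atr_levels(entry, atr, stop_mult=1.5, target_mult=3.0, long=True):
    """Return (stop_loss, take_profit) placed at ATR multiples from entry."""
    side = 1 if long else -1
    stop = entry - side * stop_mult * atr
    target = entry + side * target_mult * atr
    return stop, target

def trail_long_stop(current_stop, price, atr, trail_mult=2.0):
    """Ratchet a long stop upward; it never moves back down."""
    return max(current_stop, price - trail_mult * atr)
```

With an entry at 100 and an ATR of 2, `atr_levels(100, 2)` gives a stop at 97 and a target at 106. If price then reaches 104, `trail_long_stop(97, 104, 2)` lifts the stop to 100, and a later pullback to 101 leaves it there.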
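
**Illustrative sketch: Kelly fraction for position sizing (section 1)**

Section 1 mentions position sizing from Kelly Criterion variables without giving the formula. The standard Kelly fraction is $f = p - (1 - p)/b$, where $p$ is the win probability and $b$ is the ratio of average win to average loss. How the script derives or scales these inputs is not stated, so the sketch below only shows the textbook formula, with negative results clipped to zero as an assumed convention.

```python
def kelly_fraction(win_rate, payoff_ratio):
    """Textbook Kelly fraction f = p - (1 - p) / b, floored at zero.

    win_rate     -- probability of a winning trade, between 0 and 1
    payoff_ratio -- average win divided by average loss (must be > 0)
    """
    if payoff_ratio <= 0:
        raise ValueError("payoff_ratio must be positive")
    return max(0.0, win_rate - (1.0 - win_rate) / payoff_ratio)
```

For example, a hypothetical 55% win rate with a 1.5 payoff ratio gives a fraction of about 0.25, while a 30% win rate at a 1.0 payoff ratio gives 0.0, meaning no position.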