1/ Distributed Data Ingestion Layer
The Dash AI framework is built on a distributed data ingestion pipeline that aggregates, processes, and normalizes real-time data streams from both on-chain and off-chain sources. The pipeline operates with a latency below 50 ms, ensuring a continuous flow of high-fidelity data for predictive analytics and decision-making.
Primary Data Sources
Dash integrates and processes data from multiple sources, categorized as follows:
On-Chain Nodes:
Direct integration with Sonic RPC endpoints enables the extraction of:
Block-Level Data ($B_t$): Includes block hashes, timestamps, and transaction counts.
Transaction Logs ($T_x$): Captures wallet interactions, token transfers, and program invocations.
Token-Specific Activities ($A_s$): Tracks minting, burning, and staking operations.
DEX Aggregators:
APIs from Equalizer, Shadow, SwapX, and Dyorswap provide real-time market metrics:
Order Book Depth ($O_d$): Quantifies buy and sell orders across various price levels.
Slippage Rates ($S_p$): Measures the price impact of trades of varying sizes.
Liquidity Shifts ($L_s$): Detects changes in liquidity pools for specific trading pairs.
Off-Chain APIs:
Sentiment analytics platforms scrape data points from social networks like Twitter, Telegram, and Discord:
Data Volume ($N_p > 10{,}000/\text{s}$): High-throughput scraping captures large volumes of posts and messages.
NLP Sentiment Scoring: Integrates Natural Language Processing models to derive sentiment metrics for tokens and projects.
Data Prioritization Protocol
Dash employs a Data Prioritization Protocol to manage and rank incoming data streams based on their criticality and relevance. This ensures that the most impactful metrics are processed with higher priority.
Hierarchical Data Weighting:
Metrics are ranked using a weighted priority index ($P_i$), calculated as: $P_i = \frac{\text{Critical Metric Weight}}{\text{Total Weight}} \times 100$
Example:
If Transaction Spikes ($T_s$) are assigned a weight of 60%, and Secondary Metrics ($M_{sec}$) are assigned 40%, then: $P_i(T_s) = \frac{60}{100} \times 100 = 60\%$
This prioritizes $T_s$ over $M_{sec}$, ensuring transaction anomalies are processed first.
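The priority index calculation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Dash's actual implementation: the function names `priority_index` and `processing_order` and the metric keys `"T_s"` and `"M_sec"` are hypothetical, chosen to mirror the symbols in the example.

```python
def priority_index(weights: dict[str, float]) -> dict[str, float]:
    """Return P_i = (metric weight / total weight) * 100 for each metric."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    # Multiply before dividing to keep the arithmetic exact for round inputs.
    return {name: w * 100 / total for name, w in weights.items()}


def processing_order(weights: dict[str, float]) -> list[str]:
    """Return metric names sorted from highest to lowest priority index."""
    indices = priority_index(weights)
    return sorted(indices, key=indices.get, reverse=True)


# Weights from the example: transaction spikes 60, secondary metrics 40.
weights = {"T_s": 60.0, "M_sec": 40.0}
print(priority_index(weights))    # {'T_s': 60.0, 'M_sec': 40.0}
print(processing_order(weights))  # ['T_s', 'M_sec']
```

Because the index is normalized by the total weight, the weights do not need to sum to 100; passing 3 and 2 would yield the same 60/40 split.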
Dynamic Re-Sampling:
High-frequency data streams ($f > 1{,}000\,\text{Hz}$) are recalibrated at fixed intervals ($10\,\text{ms}$) to prevent temporal distortions caused by rapid fluctuations.
Re-Sampling Algorithm:
Input data points are aggregated within a sliding time window: $X_{\text{re-sampled}} = \frac{1}{n} \sum_{i=1}^{n} X_i$
Where $n$ is the number of data points in a 10 ms window, and $X_i$ represents individual metrics.
Example:
If 5 transaction events occur within 10 ms, their aggregated value $X_{\text{re-sampled}}$ is used for further processing to reduce noise while preserving signal integrity.
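The averaging step can be sketched as follows. This is a simplified illustration, not the production algorithm: it assumes events arrive as `(timestamp_ms, value)` pairs and uses consecutive, non-overlapping 10 ms windows aligned to multiples of the window length, which is one possible reading of "fixed intervals". The function name `resample` and the sample values are hypothetical.

```python
from collections import defaultdict

WINDOW_MS = 10  # window length stated in the text


def resample(events, window_ms=WINDOW_MS):
    """Average (timestamp_ms, value) pairs within consecutive fixed windows.

    Returns a list of (window_start_ms, mean_value), ordered by time.
    Assumption: windows do not overlap and start at multiples of window_ms.
    """
    buckets = defaultdict(list)
    for ts_ms, value in events:
        window_start = int(ts_ms // window_ms) * window_ms
        buckets[window_start].append(value)
    # X_re-sampled = (1/n) * sum(X_i) for the n points in each window
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Five illustrative events in the first 10 ms window, one in the second.
events = [(0.5, 10.0), (2.0, 12.0), (4.1, 11.0), (7.3, 13.0), (9.9, 14.0), (12.0, 20.0)]
print(resample(events))  # [(0, 12.0), (10, 20.0)]
```

The five events in the first window collapse to a single value of 12.0, matching the example of five transaction events within 10 ms being reduced to one re-sampled point.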
Latency and Throughput
Dash's distributed ingestion system achieves the following performance metrics:
Latency: Data is processed and ingested in under 50 ms, ensuring near-instantaneous availability for downstream analytics.
Throughput: Capable of handling over 500,000 data points per second, making it highly scalable for Sonic's high-throughput blockchain.
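As a rough sanity check on how these figures relate to the 10 ms re-sampling interval described earlier, the stated numbers imply the following (simple arithmetic on the quoted values, not a measured result):

$$500{,}000\ \text{points/s} \times 0.010\ \text{s} = 5{,}000\ \text{points per 10 ms window}$$

$$\frac{50\ \text{ms}}{10\ \text{ms}} = 5\ \text{re-sampling windows within the latency budget}$$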
Real-World Use Case
Imagine a scenario where a token experiences a sudden liquidity shift on Metamask, accompanied by a surge in negative sentiment on Twitter. The Dash ingestion layer prioritizes:
Liquidity Changes ($L_s$): Flags unusual withdrawals or deposits in liquidity pools.
Transaction Spikes ($T_s$): Detects abnormal activity in the token’s associated wallets.
Social Sentiment ($N_p$): Evaluates the impact of negative sentiment using NLP-derived scores.
By processing and prioritizing these metrics in real time, Dash generates actionable insights, enabling traders to react swiftly to market anomalies.
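One way to picture this prioritization is a priority queue that always releases the most critical pending event first. The sketch below is purely illustrative: the class `IngestionQueue`, the numeric priorities, and the payload fields are all hypothetical, and the ordering of $L_s$, $T_s$, and $N_p$ simply follows the order in which they are listed above.

```python
import heapq
import itertools

# Hypothetical priorities (lower number = processed first).
# The source lists the metrics in this order but assigns no numeric values.
PRIORITY = {"L_s": 0, "T_s": 1, "N_p": 2}


class IngestionQueue:
    """Min-heap of pending events keyed by metric priority, then arrival order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker preserves arrival order

    def push(self, metric, payload):
        entry = (PRIORITY[metric], next(self._counter), metric, payload)
        heapq.heappush(self._heap, entry)

    def pop(self):
        _, _, metric, payload = heapq.heappop(self._heap)
        return metric, payload

    def __len__(self):
        return len(self._heap)


# Events arrive in the "wrong" order; payload values are made up for illustration.
queue = IngestionQueue()
queue.push("N_p", {"sentiment": -0.8})
queue.push("T_s", {"tx_count": 950})
queue.push("L_s", {"pool_delta": -0.35})

order = [queue.pop()[0] for _ in range(len(queue))]
print(order)  # ['L_s', 'T_s', 'N_p']
```

The arrival counter ensures that two events of the same metric type are released in the order they were received, so prioritization never reorders a single stream.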