🎯 Project Overview
The Complexity Audio Classifier is a machine sound analysis system that uses complexity metrics derived from information theory to detect multiple levels of machine degradation. Unlike traditional audio classification methods that rely on domain-specific features, it analyzes the structural and algorithmic properties of the sound itself.
📂 Source Code Repository
The complete source code for this project is available on GitHub:
🔗 GitHub Repository
Degradation Levels:
- Healthy: Strong harmonics, minimal noise, stable frequency
- Early Warning: Slight instability, minor artifacts, subtle resonances
- Moderate Issue: Noticeable wobble, regular artifacts, multiple resonances
- Severe Issue: Highly unstable, frequent artifacts, bursts of noise
🏗️ System Architecture
🔑 Key Features
- Binary Split Game (BSG): Measures structural complexity through recursive pattern analysis
- Recursive Bilateral Symmetry (RBS): Detects hierarchical palindromic structures in signals
- Hybrid Pipeline: Fast first-pass filtering followed by detailed multi-level classification
- Information Theory Foundation: Based on Kolmogorov complexity and Shannon entropy principles
📚 Theoretical Foundation
The Complexity Audio Classifier is grounded in fundamental principles of information theory and algorithmic complexity. Understanding these concepts is crucial to appreciating how the system works.
🧮 Kolmogorov Complexity
The theoretical ideal for measuring complexity is Kolmogorov Complexity - the length of the shortest computer program that can generate a given string. However, this is uncomputable, so we use practical approximations.
Complexity Examples:
- Simple Signal: Pure sine wave - highly predictable, low complexity
- Random Signal: White noise - unpredictable and incompressible, so maximal Kolmogorov complexity but no meaningful structure
- Complex Signal: Speech/Music - structured but varied, high meaningful complexity
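As a rough illustration of a practical approximation, compressed size is a common computable stand-in for Kolmogorov complexity (an upper-bound proxy). The sketch below uses `zlib` purely for illustration; it is not one of the project's own metrics (those are BSG and RBS), and the 8-bit quantization and the helper name `compressed_size` are assumptions made for this example.

```python
import math
import random
import zlib


def compressed_size(samples: list[float]) -> int:
    """Quantize samples in [-1, 1] to 8 bits and return the zlib-compressed size in bytes."""
    quantized = bytes(min(255, max(0, int((s + 1.0) * 127.5))) for s in samples)
    return len(zlib.compress(quantized, 9))


n = 4096
# Pure sine wave: 64 samples per cycle, so the byte pattern repeats.
sine = [math.sin(2 * math.pi * i / 64) for i in range(n)]
# White noise: uniformly random samples from a seeded generator.
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]

sine_size = compressed_size(sine)
noise_size = compressed_size(noise)
print(f"sine: {sine_size} bytes, noise: {noise_size} bytes")
```

The periodic sine wave compresses to a small fraction of the noise signal's size, which matches the intuition that predictable signals have short descriptions.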
📊 Shannon Entropy
Shannon entropy quantifies the uncertainty in a signal. For a probability distribution P = (p_1, ..., p_n), entropy is calculated as:
$$H(P) = -\sum_{i=1}^{n} p_i \log_2 p_i$$
Spectral Entropy Application:
- Compute power spectrum via FFT
- Normalize to create probability distribution
- Apply Shannon entropy formula
- Normalize by maximum possible entropy
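The four steps above can be sketched as follows. This is a minimal, dependency-free version that uses a naive DFT from the standard library instead of a real FFT; a production pipeline would more likely call an FFT library. The function names are hypothetical, and the base-2 logarithm is an assumption (the base cancels out after normalization).

```python
import cmath
import math
import random


def power_spectrum(signal: list[float]) -> list[float]:
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(signal: list[float]) -> float:
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    power = power_spectrum(signal)
    total = sum(power)
    if total == 0.0 or len(power) < 2:
        return 0.0
    probs = [p / total for p in power]  # power spectrum as a probability distribution
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return entropy / math.log2(len(probs))  # divide by the maximum possible entropy


n = 256
sine = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

sine_entropy = spectral_entropy(sine)
noise_entropy = spectral_entropy(noise)
```

A pure tone concentrates its power in one bin and scores near 0, while white noise spreads power across all bins and scores close to 1.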
🎵 Audio Processing Pipeline
- Audio Loading: Convert to 22050 Hz, mono, normalized
- Spectrogram Computation: Short-time Fourier Transform (STFT)
- Binarization: Convert to binary representation using adaptive thresholding
- Feature Extraction: Apply BSG and RBS algorithms
- Classification: Random Forest classifier on extracted features
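The binarization step can be sketched as below. The source does not say which adaptive thresholding rule is used, so this example assumes one plausible choice: each spectrogram frame is thresholded at its own median magnitude. The function names and the list-of-lists spectrogram layout are also assumptions.

```python
from statistics import median


def binarize_frame(magnitudes: list[float]) -> str:
    """Map one frame of spectral magnitudes to a bit string.

    Assumed rule: a bin becomes '1' if it is strictly above the frame's median.
    """
    threshold = median(magnitudes)
    return "".join("1" if m > threshold else "0" for m in magnitudes)


def binarize_spectrogram(frames: list[list[float]]) -> list[str]:
    """Binarize every frame independently, so the threshold adapts per frame."""
    return [binarize_frame(frame) for frame in frames]


bits = binarize_spectrogram([[0.1, 0.9, 0.2, 0.8], [0.0, 5.0, 0.0]])
```

Because the threshold is computed per frame, a loud frame and a quiet frame with the same spectral shape produce the same bit string.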
🎮 Binary Split Game (BSG) Algorithm
The Binary Split Game is a recursive algorithm that analyzes binary strings by splitting them in half and comparing corresponding positions. It provides a computable approximation of algorithmic complexity.
🔄 How BSG Works
📏 BSG Metrics
- Structural Complexity (SC): Sum of lengths of all strings in the reduction path
- Depth: Number of reduction steps before termination
- Max Width: Length of the initial binary string
- Final State: Terminating state: '0', '1', or 'Null'
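The metrics above can be computed with a sketch like the following. The source does not spell out the comparison rule, so this example assumes one: the string is split into two halves (dropping the middle symbol when the length is odd), and only the positions where both halves agree are kept. That rule, and every name in the code, is an assumption for illustration and may differ from the project's implementation.

```python
from dataclasses import dataclass


@dataclass
class BSGResult:
    structural_complexity: int  # sum of lengths of all strings in the path
    depth: int                  # number of reduction steps
    max_width: int              # length of the initial string
    final_state: str            # '0', '1', or 'Null'
    path: list[str]


def bsg_step(bits: str) -> str:
    """Assumed reduction rule: keep the symbols on which both halves agree."""
    half = len(bits) // 2
    left, right = bits[:half], bits[len(bits) - half:]
    return "".join(a for a, b in zip(left, right) if a == b)


def binary_split_game(bits: str) -> BSGResult:
    """Reduce until at most one symbol remains, recording the whole path."""
    path = [bits]
    while len(path[-1]) > 1:
        path.append(bsg_step(path[-1]))
    final_state = path[-1] if path[-1] else "Null"
    return BSGResult(
        structural_complexity=sum(len(s) for s in path),
        depth=len(path) - 1,
        max_width=len(bits),
        final_state=final_state,
        path=path,
    )


result = binary_split_game("1111")
```

Under this assumed rule, `"1111"` reduces through `"11"` to `"1"` (SC 7, depth 2), while `"0110"` collapses to the empty string in one step (SC 4, final state 'Null').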
🎯 BSG in Audio Analysis
Research findings show that BSG assigns higher structural complexity to ordered signals with clear spectral patterns than to chaotic signals. This counterintuitive result makes BSG particularly effective for distinguishing healthy (ordered) machine sounds from failing (chaotic) ones.
🔄 Recursive Bilateral Symmetry (RBS) Algorithm
RBS detects hierarchical palindromic structures in strings, representing a nested arrangement of symmetrical patterns. It's particularly effective at identifying self-similar structures in audio signals.
🧩 Understanding RBS
RBS Example: "01111011110"
Structure Analysis:
- Core "101" at positions 5-7: RBS Order 0
- Framed by "11": "11(101)11" → "1110111": RBS Order 1
- Framed by "01" and "10": "01(1110111)10" → Full string: RBS Order 2
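The nesting in the structure analysis above can be reproduced with a short sketch. It assumes that each RBS order adds one frame on the left and its mirror image on the right; the helper names are hypothetical, and this only constructs and checks the example, it is not the project's detection algorithm.

```python
def frame(core: str, left: str) -> str:
    """Wrap `core` in `left` on one side and the mirror image of `left` on the other."""
    return left + core + left[::-1]


def is_palindrome(bits: str) -> bool:
    return bits == bits[::-1]


def build_rbs(core: str, frames: list[str]) -> list[str]:
    """Return the string at each order, from order 0 (the core) outward.

    `frames` lists the left-hand frames from innermost to outermost.
    """
    levels = [core]
    for left in frames:
        levels.append(frame(levels[-1], left))
    return levels


levels = build_rbs("101", ["11", "01"])
for order, bits in enumerate(levels):
    print(order, bits, is_palindrome(bits))
```

Each level is itself a palindrome, and the index of a level in the list is its RBS order under this reading.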
📊 RBS Patterns in Machine Sounds
- Healthy Machines: Higher RBS order (~15.5), more symmetrical patterns
- Early Warning: Moderate RBS order (~10.1), some symmetry loss
- Moderate Issue: Lower RBS order (~5.5), significant asymmetry
- Severe Issue: Lowest RBS order (~3.6), minimal symmetry
🔄 Hybrid Classification Pipeline
The system uses a two-stage hybrid approach that combines fast filtering with detailed analysis for optimal performance and accuracy.
⚡ Stage 1: Adaptive Complexity Filter
Fast First-Pass Filtering
- Quick Metrics: Calculate BSG complexity, RBS order, transition counts
- Threshold Comparison: Compare against calibrated thresholds
- Binary Classification: Healthy vs. Potentially Failing
- Decision: Proceed to detailed analysis if needed
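The Stage 1 decision can be sketched as a simple threshold check. The threshold values below are placeholders, not the project's calibrated values. The direction of the transition-count comparison (more transitions means more chaotic) is an assumption, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterThresholds:
    min_bsg_complexity: float  # healthy signals are expected at or above this
    min_rbs_order: float       # healthy signals are expected at or above this
    max_transitions: int       # assumption: healthy signals stay at or below this


def needs_detailed_analysis(
    bsg_complexity: float,
    rbs_order: float,
    transitions: int,
    thresholds: FilterThresholds,
) -> bool:
    """Return True when the sample is 'potentially failing' and should go to Stage 2."""
    looks_healthy = (
        bsg_complexity >= thresholds.min_bsg_complexity
        and rbs_order >= thresholds.min_rbs_order
        and transitions <= thresholds.max_transitions
    )
    return not looks_healthy


# Placeholder thresholds for illustration only.
thresholds = FilterThresholds(min_bsg_complexity=100.0, min_rbs_order=12.0, max_transitions=50)
healthy_flag = needs_detailed_analysis(150.0, 15.5, 20, thresholds)
failing_flag = needs_detailed_analysis(150.0, 5.5, 20, thresholds)
```

Requiring every metric to pass keeps the filter conservative: a sample that fails any single check is forwarded to the detailed classifier.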
🔬 Stage 2: Enhanced Hybrid Classifier
Detailed Multi-Level Analysis
Feature Extraction
- BSG Complexity
- RBS Orders (max, avg)
- Hamming Weight
- Transition Counts
- Normalized Metrics
Classification
- Random Forest Classifier
- 100 decision trees
- Four-class output
- Feature importance ranking
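The simpler Stage 2 features (Hamming weight, transition counts, and their normalized forms) can be sketched as below. The function names, the dictionary keys, and the choice to normalize by string length are assumptions for illustration.

```python
def hamming_weight(bits: str) -> int:
    """Number of '1' symbols in the bit string."""
    return bits.count("1")


def transition_count(bits: str) -> int:
    """Number of adjacent positions where the symbol changes."""
    return sum(a != b for a, b in zip(bits, bits[1:]))


def extract_basic_features(bits: str) -> dict[str, float]:
    """Raw and length-normalized counts for one binarized sample."""
    n = len(bits)
    weight = hamming_weight(bits)
    transitions = transition_count(bits)
    return {
        "hamming_weight": weight,
        "transitions": transitions,
        "hamming_weight_norm": weight / n if n else 0.0,
        # A string of n symbols has n - 1 adjacent pairs.
        "transitions_norm": transitions / (n - 1) if n > 1 else 0.0,
    }


features = extract_basic_features("0110")
```

Feature vectors like this, combined with the BSG and RBS values, would then be passed to the Random Forest classifier, for example scikit-learn's `RandomForestClassifier(n_estimators=100)`, although the source does not name the library it uses.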
📈 Performance Metrics
System Performance
- Accuracy: 87.5% across four degradation levels
- BSG Speed: 0.0017 s per sample
- RBS Speed: 0.0502 s per sample
- Total Processing: 0.0711 s for complete feature extraction