Inference (Machine Learning)
Inference is the stage at which a trained machine-learning model is used to make predictions on new, live data, as opposed to the training stage, in which the model learns its parameters from existing data. In industrial settings, inference often runs continuously on streaming sensor data.
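The split between the two stages can be illustrated with a minimal sketch. It assumes a toy anomaly detector rather than a real model: the "training" step learns only a mean and standard deviation from past readings, and the function names (train, infer, run_on_stream), the readings, and the threshold factor k are all hypothetical.

```python
import statistics
from typing import Iterable, Iterator, List, Tuple

Model = Tuple[float, float]  # (mean, standard deviation) learned in training


def train(history: List[float]) -> Model:
    """Training stage: learn parameters once from historical readings."""
    return statistics.mean(history), statistics.pstdev(history)


def infer(model: Model, reading: float, k: float = 3.0) -> bool:
    """Inference stage: apply the frozen parameters to one new reading.

    Returns True when the reading lies more than k standard deviations
    from the learned mean (k is an illustrative choice).
    """
    mean, std = model
    return abs(reading - mean) > k * std


def run_on_stream(model: Model, stream: Iterable[float]) -> Iterator[bool]:
    """Run inference on each reading as it arrives; the model is never updated."""
    for reading in stream:
        yield infer(model, reading)


# Train once on past data, then run inference repeatedly on new readings.
history = [20.0, 20.5, 19.5, 20.2, 19.8]
model = train(history)
flags = list(run_on_stream(model, [20.1, 25.0, 19.9]))
print(flags)  # [False, True, False]
```

Training happens once here and its result is reused for every incoming reading, which mirrors the way a deployed model is called repeatedly without being retrained on each call.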
Training and inference place very different demands on a system. Training is compute-heavy and done periodically, whereas inference runs repeatedly, often in real time and sometimes on constrained edge hardware close to the equipment. Inference latency, throughput, and cost therefore shape how and where a model is deployed. Edge AI addresses this by running inference locally, giving fast, reliable predictions without round-tripping data to the cloud.
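Latency and throughput, the quantities named above, can be measured with a simple timing loop. The sketch below uses only the Python standard library. The benchmark helper and toy_predict are hypothetical stand-ins for a real model call, and the 95th percentile uses a simple nearest-rank approximation.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def benchmark(predict: Callable[[float], float],
              inputs: Sequence[float]) -> Dict[str, float]:
    """Time each call to predict and summarise latency and throughput."""
    if not inputs:
        raise ValueError("inputs must not be empty")

    latencies_ms = []
    start = time.perf_counter()
    for x in inputs:
        t0 = time.perf_counter()
        predict(x)
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - start

    latencies_ms.sort()
    # Nearest-rank approximation of the 95th percentile.
    p95_index = min(len(latencies_ms) - 1, int(0.95 * len(latencies_ms)))
    return {
        "p50_ms": statistics.median(latencies_ms),
        "p95_ms": latencies_ms[p95_index],
        # Guard against a zero reading from a coarse clock.
        "throughput_per_s": len(inputs) / elapsed if elapsed > 0 else float("inf"),
    }


def toy_predict(x: float) -> float:
    """Hypothetical stand-in for a trained model's predict call."""
    return 2.0 * x + 1.0


stats = benchmark(toy_predict, [float(i) for i in range(1000)])
print(stats)
```

Running the same loop on a cloud server and on the target edge device gives a rough comparison of the two deployment options, although a realistic comparison would also need to include network round-trip time for the cloud case.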