Simple Flask web app to deploy a model
Description
A simple Flask web application can be used to deploy machine learning models, making them accessible via web APIs. Flask is a lightweight Python web framework that allows you to create RESTful endpoints. You can train a model, save it using pickle or joblib, then load it in a Flask app to serve predictions via HTTP requests.
Key Concepts
- Train and save machine learning models (pickle/joblib)
- Load the saved model inside Flask to avoid reloading on each request
- Flask routes handle requests and responses
- Input validation and JSON-based prediction responses
- Extendable for production use with Gunicorn, Docker, etc.
Example Code
1. Train and Save Model (Python Script)
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
import pickle
# Load example dataset
data = load_iris()
X, y = data.data, data.target
# Train a simple RandomForest model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)
# Save the trained model to disk
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)
print("Model saved successfully.")
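Before wiring the model into Flask, it is worth verifying that the serialized file round-trips correctly. The sketch below trains a small model, saves and reloads it with both pickle and joblib (joblib is generally recommended for scikit-learn models because it handles large NumPy arrays more efficiently), and checks that the restored models predict identically. The temp-file paths are illustrative, not part of the app above.

```python
# Sanity check: serialize a model, reload it, and confirm predictions match.
# joblib is shown as an alternative to pickle for scikit-learn models.
import os
import pickle
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
model = RandomForestClassifier(n_estimators=10, random_state=42)
model.fit(data.data, data.target)

with tempfile.TemporaryDirectory() as tmp:
    # pickle round-trip
    pkl_path = os.path.join(tmp, 'model.pkl')
    with open(pkl_path, 'wb') as f:
        pickle.dump(model, f)
    with open(pkl_path, 'rb') as f:
        restored = pickle.load(f)

    # joblib round-trip
    jl_path = os.path.join(tmp, 'model.joblib')
    joblib.dump(model, jl_path)
    restored_jl = joblib.load(jl_path)

same_pickle = bool((restored.predict(data.data) == model.predict(data.data)).all())
same_joblib = bool((restored_jl.predict(data.data) == model.predict(data.data)).all())
print(same_pickle, same_joblib)
```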
2. Flask App to Load & Serve the Saved Model
from flask import Flask, request, jsonify
import pickle
import numpy as np
app = Flask(__name__)
# Load the saved model once when app starts
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

@app.route('/')
def home():
    return "Welcome to the Iris Classifier API!"

@app.route('/predict', methods=['POST'])
def predict():
    try:
        # Parse JSON input
        data = request.get_json(force=True)
        features = data['features']
        # Convert to a numpy array and reshape for a single prediction
        input_data = np.array(features).reshape(1, -1)
        # Predict the class
        prediction = model.predict(input_data)
        # Return the predicted class label and per-class probabilities
        prediction_proba = model.predict_proba(input_data)
        return jsonify({
            'prediction': int(prediction[0]),
            'probabilities': prediction_proba[0].tolist()
        })
    except Exception as e:
        # Return a 400 status so clients can distinguish bad input from success
        return jsonify({'error': str(e)}), 400

if __name__ == '__main__':
    app.run(debug=True)
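The /predict route can be exercised without running a server by using Flask's built-in test client. The sketch below is self-contained: DummyModel stands in for the pickled classifier (its fixed outputs are an assumption for illustration); against the real app you would import the `app` object instead of redefining it.

```python
# Exercise a /predict endpoint with Flask's test client -- no live server needed.
# DummyModel is a stand-in for the real pickled classifier.
import numpy as np
from flask import Flask, request, jsonify

class DummyModel:
    def predict(self, X):
        return np.array([0] * len(X))
    def predict_proba(self, X):
        return np.array([[0.9, 0.05, 0.05]] * len(X))

model = DummyModel()
app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json(force=True)
    input_data = np.array(data['features']).reshape(1, -1)
    return jsonify({
        'prediction': int(model.predict(input_data)[0]),
        'probabilities': model.predict_proba(input_data)[0].tolist(),
    })

# test_client() simulates HTTP requests in-process
client = app.test_client()
resp = client.post('/predict', json={'features': [5.1, 3.5, 1.4, 0.2]})
result = resp.get_json()
print(resp.status_code, result)
```

The same pattern drops straight into a pytest function, which is how automated tests for the API are usually written.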
Real-World Applications
Model Deployment with Flask
- Rapid Prototyping: Quickly deploy models for demos or proof-of-concept APIs.
- Microservices: Serve models as standalone services in production environments.
- Integration: Connect ML models with web or mobile apps via REST APIs.
- Scalability: Use Gunicorn, Docker, and cloud platforms to scale deployments.
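As a sketch of the scalability point, a production launch with Gunicorn might look like the following, assuming the Flask `app` object lives in a file named app.py (both names are assumptions, not fixed by the example above):

```shell
# 4 worker processes, bound to all interfaces on port 8000;
# "app:app" means "the object named app inside module app.py"
gunicorn --workers 4 --bind 0.0.0.0:8000 app:app
```

The built-in `app.run(debug=True)` server is single-threaded and intended only for development; Gunicorn's worker processes let the API handle concurrent requests.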

Resources
The following resources will be manually added later:
Video Tutorials
PDF/DOC Materials
Interview Questions
1. Why use Flask for deploying machine learning models?
Flask is lightweight, easy to use, and allows quick creation of REST APIs to serve ML models. It’s ideal for prototyping and small to medium-scale deployments.
2. How do you handle input validation in a Flask prediction API?
Validate JSON payloads, check for required keys, verify data types and shapes, and handle exceptions gracefully to prevent server errors.
3. How can you improve the performance of a Flask ML API in production?
Use production-grade servers like Gunicorn, enable concurrency with multiple workers, use caching, containerize with Docker, and deploy behind reverse proxies like Nginx.
4. How do you load a machine learning model in a Flask app?
Load the serialized model file (pickle/joblib) once when the Flask app starts to avoid loading overhead on each prediction request.
5. How do you test a Flask prediction API?
Use Postman or curl to send POST requests with JSON input, verify prediction responses, and write automated tests using pytest with Flask's test client.
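The validation practices from question 2 can be sketched as a small helper. The function name `validate_features` and the feature count constant are illustrative assumptions, not part of the example app above.

```python
# Validate a /predict payload before it reaches the model.
# Returns (features, error): a list of floats on success,
# or a human-readable error message on failure.
N_FEATURES = 4  # the iris model expects exactly 4 numeric features

def validate_features(payload):
    if not isinstance(payload, dict):
        return None, "JSON body must be an object"
    if 'features' not in payload:
        return None, "missing required key 'features'"
    features = payload['features']
    if not isinstance(features, (list, tuple)):
        return None, "'features' must be a list"
    if len(features) != N_FEATURES:
        return None, f"expected {N_FEATURES} features, got {len(features)}"
    try:
        return [float(x) for x in features], None
    except (TypeError, ValueError):
        return None, "all features must be numeric"

print(validate_features({'features': [5.1, 3.5, 1.4, 0.2]}))
print(validate_features({'features': [5.1, 'oops', 1.4, 0.2]}))
```

Inside the Flask route, a failed check would be returned with a 400 status so the client can tell bad input apart from a server fault.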