Covers the Keras API, eager execution, tf.data, SavedModel, TF Lite, TF Serving, and distributed training.
TensorFlow is Google's open-source deep learning framework. Keras is its high-level API for building and training models quickly. TensorFlow 2.x integrates Keras as the default API.
import tensorflow as tf
from tensorflow import keras
print(tf.__version__) # e.g., 2.17.0
# ── Tensors ──
scalar = tf.constant(42)
vector = tf.constant([1.0, 2.0, 3.0])
matrix = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
zeros = tf.zeros((3, 3))
ones = tf.ones((2, 4))
random = tf.random.normal((3, 3), mean=0, stddev=1)
# Tensor operations
a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])
print(a + b) # Element-wise addition
print(tf.matmul(a, b)) # Matrix multiplication
print(tf.reduce_mean(a)) # Mean: 2 (integer dtype truncates; cast to float32 for 2.5)
print(tf.reshape(a, (4, 1))) # Reshape
# GPU check
print("GPUs Available:", tf.config.list_physical_devices('GPU'))
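When a GPU is present, it often helps to enable memory growth so TensorFlow allocates GPU memory incrementally instead of reserving the whole device up front; a minimal sketch (a no-op on CPU-only machines):

```python
import tensorflow as tf

# Must run before any GPU has been initialized by an op.
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    # Allocate GPU memory as needed rather than grabbing it all at once.
    tf.config.experimental.set_memory_growth(gpu, True)
print("Memory growth enabled for", len(gpus), "GPU(s)")
```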
# ── Eager vs Graph Execution ──
# Eager: default in TF2, executes immediately
# Graph: use @tf.function for performance
@tf.function
def compute(a, b):
    return tf.matmul(a, b)

result = compute(tf.ones((1000, 1000)), tf.ones((1000, 1000)))

| Feature | TensorFlow 2.x | PyTorch |
|---|---|---|
| Creator | Google Brain | Meta AI (FAIR) |
| API Style | Keras (high-level) + tf API | Pythonic, torch.nn modules |
| Graph Mode | @tf.function decorator | torch.compile (optional) |
| Deployment | TF Serving, TF Lite, TF.js, SavedModel | TorchServe, ONNX, TensorRT |
| Mobile/Edge | TF Lite (mature), TF Micro | PyTorch Mobile, ExecuTorch |
| Primary Adoption | Industry/production | Research community (majority) |
| Visualization | TensorBoard (built-in) | TensorBoard (via torch.utils.tensorboard) |
| Learning Curve | Easier for beginners (Keras) | More Pythonic, intuitive for researchers |
| TPU Support | Native, first-class | Via PyTorch/XLA (less mature) |
Keras provides a modular approach to building neural networks. Layers are the building blocks; models define how layers connect.
from tensorflow import keras
from tensorflow.keras import layers
# ── Sequential API ──
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation='relu'),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation='relu'),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax'),
])
model.summary()
# ── Functional API ──
inputs = keras.Input(shape=(784,))
x = layers.Dense(256, activation='relu')(inputs)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.3)(x)
x = layers.Dense(128, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="mlp_classifier")
# ── Custom Layers ──
class ResidualBlock(layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.dense1 = layers.Dense(units, activation='relu')
        self.dense2 = layers.Dense(units)
        self.bn = layers.BatchNormalization()

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dense2(x)
        x = self.bn(x, training=training)
        # Skip connection; use tf.nn.relu rather than creating a new
        # Activation layer on every call
        return tf.nn.relu(x + inputs)
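A custom layer like this plugs straight into the Functional API. A self-contained usage sketch (the block definition is repeated here so the snippet runs on its own; note the skip connection requires `units` to match the incoming feature size):

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class ResidualBlock(layers.Layer):
    """Two Dense layers plus a skip connection (input dims must equal units)."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.dense1 = layers.Dense(units, activation='relu')
        self.dense2 = layers.Dense(units)
        self.bn = layers.BatchNormalization()

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dense2(x)
        x = self.bn(x, training=training)
        return tf.nn.relu(x + inputs)  # skip connection

inputs = keras.Input(shape=(64,))
x = ResidualBlock(64)(inputs)   # units == input feature size
outputs = layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs, outputs)
print(model.output_shape)  # (None, 10)
```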
# ── Common Layer Types ──
layers.Dense(64, activation='relu') # Fully connected
layers.Conv2D(32, 3, padding='same') # 2D convolution
layers.Conv1D(64, 5, activation='relu') # 1D convolution (text, time series)
layers.LSTM(128, return_sequences=True) # RNN layer
layers.GRU(64) # Gated recurrent unit
layers.Bidirectional(layers.LSTM(64)) # Bidirectional wrapper
layers.Embedding(10000, 128) # Word embeddings
layers.MultiHeadAttention(num_heads=8, key_dim=64) # Transformer attention
layers.GlobalAveragePooling2D() # Global average pooling
layers.BatchNormalization() # Batch normalization
layers.LayerNormalization() # Layer normalization

| Layer | Use Case | Key Args | Output Shape |
|---|---|---|---|
| Dense | Fully connected layers | units, activation, use_bias | (batch, units) |
| Conv2D | Image feature extraction | filters, kernel_size, strides, padding | (batch, H, W, filters) |
| Conv1D | Sequence/text features | filters, kernel_size, strides | (batch, seq_len, filters) |
| MaxPooling2D | Spatial downsampling | pool_size, strides | (batch, H/ps, W/ps, ch) |
| LSTM | Sequential data processing | units, return_sequences, return_state | (batch, seq, units) |
| GRU | Lighter sequential processing | units, return_sequences | (batch, seq, units) |
| Embedding | Word/token to vector | input_dim, output_dim | (batch, seq, embed_dim) |
| BatchNormalization | Stabilize training | momentum, epsilon | Same as input |
| Dropout | Regularization | rate | Same as input |
| Flatten | Reshape to 1D | - | (batch, H*W*C) |
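The output shapes in the table can be checked directly by calling a layer on a dummy tensor (shapes below use an illustrative batch of 8):

```python
import tensorflow as tf
from tensorflow.keras import layers

dense_out = layers.Dense(32)(tf.zeros((8, 16)))                          # (batch, units)
lstm_seq  = layers.LSTM(64, return_sequences=True)(tf.zeros((8, 20, 16)))  # (batch, seq, units)
lstm_last = layers.LSTM(64)(tf.zeros((8, 20, 16)))                        # (batch, units)
embed     = layers.Embedding(1000, 128)(tf.zeros((8, 20), dtype=tf.int32))  # (batch, seq, embed_dim)

print(dense_out.shape)  # (8, 32)
print(lstm_seq.shape)   # (8, 20, 64)
print(lstm_last.shape)  # (8, 64)
print(embed.shape)      # (8, 20, 128)
```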
Training a neural network in Keras involves compiling (loss, optimizer, metrics), fitting (epochs, batches), and evaluating on test data.
from tensorflow import keras
from tensorflow.keras import layers, callbacks
# ── Build Model ──
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation='relu'),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax'),
])
# ── Compile ──
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)
# ── Callbacks ──
cb_list = [
    callbacks.EarlyStopping(patience=5, restore_best_weights=True),
    callbacks.ModelCheckpoint('best_model.keras', save_best_only=True),
    callbacks.ReduceLROnPlateau(factor=0.5, patience=3, min_lr=1e-6),
    callbacks.TensorBoard(log_dir='./logs'),
]
# ── Train ──
history = model.fit(
    x_train, y_train,
    epochs=50,
    batch_size=64,
    validation_split=0.2,
    callbacks=cb_list,
    verbose=1,
)
# ── Evaluate ──
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")
# ── Predict ──
predictions = model.predict(x_test[:5])
predicted_classes = tf.argmax(predictions, axis=1)
# ── Custom Training Loop ──
optimizer = keras.optimizers.Adam(learning_rate=1e-3)
loss_fn = keras.losses.SparseCategoricalCrossentropy()
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

for epoch in range(10):
    for step, (x_batch, y_batch) in enumerate(train_dataset):
        loss = train_step(x_batch, y_batch)

| Optimizer | Learning Rate | Key Feature | Best For |
|---|---|---|---|
| SGD | 0.01-0.1 | Simple, well-understood, with momentum | Large datasets, fine-tuning with low LR |
| Adam | 0.001 (default) | Adaptive LR per parameter, momentum | General-purpose, most common default |
| AdamW | 0.001 | Adam with decoupled weight decay | Transformer training, better generalization |
| RMSprop | 0.001 | Adaptive LR based on recent gradients | RNNs, non-stationary objectives |
| Adagrad | 0.01 | Accumulated squared gradients, decaying LR | Sparse gradients (NLP, embeddings) |
| Nadam | 0.001 | Adam + Nesterov momentum | Image classification, NLP |
| Lion | 0.001 | Memory-efficient, sign-based updates | Large models, memory-constrained training |
| Loss | Task | Output Activation | Use When |
|---|---|---|---|
| binary_crossentropy | Binary classification | sigmoid | 2-class problems (spam/not spam) |
| categorical_crossentropy | Multi-class (one-hot) | softmax | Multi-class with one-hot labels |
| sparse_categorical_crossentropy | Multi-class (integers) | softmax | Multi-class with integer labels |
| mse (Mean Squared Error) | Regression | linear (none) | Predicting continuous values |
| mae (Mean Absolute Error) | Regression (robust) | linear | Regression with outliers |
| huber | Regression (smooth) | linear | Combines MSE/MAE benefits |
| cosine_similarity | Embeddings/similarity | - | Face recognition, embeddings |
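The only difference between the two multi-class losses is the label format: integer class IDs versus one-hot vectors. A quick check that they agree on the same prediction (values here are illustrative):

```python
import tensorflow as tf
from tensorflow import keras

probs = tf.constant([[0.1, 0.7, 0.2]])  # softmax output for one sample
sparse = keras.losses.SparseCategoricalCrossentropy()
onehot = keras.losses.CategoricalCrossentropy()

# Integer label vs its one-hot encoding yield the same loss value.
l1 = sparse(tf.constant([1]), probs)
l2 = onehot(tf.constant([[0.0, 1.0, 0.0]]), probs)
print(float(l1), float(l2))  # both ≈ -log(0.7) ≈ 0.357
```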
tf.data is TensorFlow's API for building efficient input pipelines. It handles reading, preprocessing, batching, and prefetching data for training.
import tensorflow as tf
# ── From NumPy Arrays ──
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset = dataset.shuffle(10000).batch(32).prefetch(tf.data.AUTOTUNE)
# ── From Image Directory ──
train_ds = tf.keras.utils.image_dataset_from_directory(
    './data/images/',
    validation_split=0.2,
    subset='training',
    seed=42,
    image_size=(224, 224),
    batch_size=32,
)
# ── From CSV ──
dataset = tf.data.experimental.make_csv_dataset(
    './data.csv',
    batch_size=32,
    label_name='target',
    num_epochs=1,
)
# ── Advanced Pipeline with Augmentation ──
AUTOTUNE = tf.data.AUTOTUNE
def augment(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_contrast(image, lower=0.8, upper=1.2)
    return image, label

train_ds = (train_ds
    .map(augment, num_parallel_calls=AUTOTUNE)
    .shuffle(1000)
    .batch(32)
    .prefetch(AUTOTUNE)
)
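For data that fits in memory (or on local disk), adding `cache()` early in the pipeline avoids re-reading and re-decoding every epoch. A runnable sketch of the commonly recommended ordering, using a synthetic dataset:

```python
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE
ds = tf.data.Dataset.range(1000)  # stand-in for a decoded dataset

ds = (ds
    .cache()             # cache after expensive I/O / decoding steps
    .shuffle(1000)       # shuffle before batching
    .batch(32)
    .prefetch(AUTOTUNE)  # overlap training with data preparation
)
print(sum(1 for _ in ds))  # 32 batches (ceil(1000 / 32))
```

The first epoch fills the cache; subsequent epochs read from it directly, which is often the single biggest pipeline speedup.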
# ── TFRecord for Large Datasets ──
# Write TFRecord
feature = {
    'image': tf.train.Feature(bytes_list=tf.train.BytesList(
        value=[image_bytes])),
    'label': tf.train.Feature(int64_list=tf.train.Int64List(
        value=[label_id])),
}
example = tf.train.Example(features=tf.train.Features(feature=feature))
with tf.io.TFRecordWriter('data.tfrecord') as writer:
    writer.write(example.SerializeToString())
# Read TFRecord
def parse_fn(example_proto):
    feature_desc = {
        'image': tf.io.FixedLenFeature([], tf.string),
        'label': tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(example_proto, feature_desc)
    image = tf.io.decode_jpeg(parsed['image'], channels=3)
    return image, parsed['label']

dataset = tf.data.TFRecordDataset('data.tfrecord').map(parse_fn)

Transfer learning reuses knowledge from a model pre-trained on large datasets (ImageNet, etc.) for your specific task. It dramatically reduces training time and data requirements.
from tensorflow import keras
from tensorflow.keras import layers, applications
# ── Feature Extraction (freeze base) ──
base_model = applications.MobileNetV2(
    weights='imagenet',
    input_shape=(224, 224, 3),
    include_top=False,
)
base_model.trainable = False # Freeze all layers
model = keras.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(num_classes, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# ── Fine-Tuning (unfreeze top layers) ──
base_model.trainable = True
# Freeze bottom layers, fine-tune top N layers
for layer in base_model.layers[:-20]:
    layer.trainable = False
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),  # Very low LR!
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)
# ── Popular Pre-trained Models ──
# applications.ResNet50(weights='imagenet', include_top=False)
# applications.EfficientNetB0(weights='imagenet')
# applications.VGG16(weights='imagenet')
# applications.InceptionV3(weights='imagenet')
# applications.ConvNeXtBase(weights='imagenet')

| Model | Top-1 Acc | Params | Size | Best For |
|---|---|---|---|---|
| MobileNetV2 | 71.8% | 3.4M | 14 MB | Mobile, edge, real-time |
| MobileNetV3-Small | 67.4% | 2.5M | 11 MB | Ultra-light mobile apps |
| EfficientNet-B0 | 77.1% | 5.3M | 21 MB | Best accuracy-to-size ratio |
| EfficientNet-B7 | 84.3% | 66M | 256 MB | Max accuracy from efficient family |
| ResNet50 | 76.0% | 25.6M | 98 MB | General-purpose baseline |
| ResNet152 | 78.3% | 60.2M | 232 MB | High accuracy, research |
| InceptionV3 | 77.9% | 23.8M | 92 MB | Multi-scale features |
| ConvNeXt-Base | 85.3% | 88.6M | 350 MB | Modern CNN, matches ViT |
| VGG16 | 71.3% | 138M | 528 MB | Feature extraction, simple |
TensorFlow provides multiple deployment paths: SavedModel for serving, TF Lite for mobile/embedded, and TF.js for web browsers.
# ── SavedModel (for TF Serving) ──
model.save('my_model') # SavedModel directory (TF <= 2.15; Keras 3 uses model.export('my_model'))
model.save('my_model.keras') # Single-file Keras format (TF 2.13+)
# Load
loaded_model = keras.models.load_model('my_model.keras')
# ── TF Lite Conversion ──
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT] # Quantization
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
# ── TF Lite Inference ──
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])
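For full integer quantization (needed by many microcontrollers and Edge TPUs), the converter also needs a representative dataset to calibrate activation ranges. A self-contained sketch using a tiny stand-in model and random calibration data (both are assumptions for illustration; substitute your own model and real training samples):

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Stand-in model and calibration inputs for demonstration only.
model = keras.Sequential([
    keras.layers.Input(shape=(4,)),
    keras.layers.Dense(2, activation='softmax'),
])
calib = np.random.rand(100, 4).astype(np.float32)

def representative_data_gen():
    # Yield typical inputs so the converter can calibrate activation ranges.
    for sample in calib:
        yield [sample[None, :]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
# To force int8-only ops for int8-only targets (uncomment as needed):
# converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# converter.inference_input_type = tf.int8
# converter.inference_output_type = tf.int8
tflite_int8 = converter.convert()
print(len(tflite_int8), "bytes")
```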
# ── ONNX Export (via tf2onnx) ──
import tf2onnx
spec = (tf.TensorSpec((None, 224, 224, 3), tf.float32, name="input"),)
model_onnx, _ = tf2onnx.convert.from_keras(model, input_signature=spec)
with open("model.onnx", "wb") as f:
    f.write(model_onnx.SerializeToString())

| Platform | Format | Size Reduction | Use Case |
|---|---|---|---|
| TF Serving (REST/gRPC) | SavedModel | None | Cloud production APIs |
| TF Lite | .tflite | 3-4x (with quantization) | Android, iOS, Raspberry Pi, ESP32 |
| TF Lite Micro | .tflite | 4-10x | Microcontrollers (STM32, Arduino) |
| TF.js | JSON + weights | None | Browser, Node.js |
| ONNX Runtime | .onnx | Varies | Cross-platform, C#/C++ apps |
| TensorRT | .plan | 2-3x (FP16/INT8) | NVIDIA GPUs, max inference speed |
Convolutional Neural Networks (CNNs) are the backbone of computer vision. They learn spatial hierarchies of features through convolutional and pooling layers.
from tensorflow import keras
from tensorflow.keras import layers
# ── Simple CNN for CIFAR-10 ──
model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    # Block 1
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.BatchNormalization(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D(2),
    layers.Dropout(0.25),
    # Block 2
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.BatchNormalization(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D(2),
    layers.Dropout(0.25),
    # Classifier
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# ── Data Augmentation Layer (Keras 3) ──
data_augmentation = keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.1),
])
model = keras.Sequential([
    data_augmentation,
    layers.Rescaling(1./255),
    # ... rest of model
])

| Architecture | Year | Innovation | Params |
|---|---|---|---|
| LeNet-5 | 1998 | First successful CNN (handwriting) | 60K |
| AlexNet | 2012 | Deep CNN + ReLU + dropout | 60M |
| VGG16 | 2014 | Uniform 3x3 convolutions, depth | 138M |
| GoogLeNet | 2014 | Inception modules (multi-scale) | 6.8M |
| ResNet | 2015 | Skip connections (residual learning) | 25M (ResNet50) |
| DenseNet | 2017 | Dense connections between layers | 8M (DenseNet121) |
| EfficientNet | 2019 | Compound scaling (depth, width, res) | 5.3M (B0) |
| ConvNeXt | 2022 | Modernized CNN, matches ViT | 88M (Base) |
Regularization techniques prevent overfitting and improve model generalization. Debugging ML models requires understanding common failure modes and how to diagnose them.
from tensorflow import keras
from tensorflow.keras import layers, regularizers
# ── Weight Regularization ──
layers.Dense(64, activation='relu',
             kernel_regularizer=regularizers.l2(0.01))
layers.Conv2D(32, 3,
              kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4))
# ── Dropout ──
layers.Dropout(0.5) # Drop 50% of neurons during training
# ── Early Stopping ──
early_stop = keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=10,
    restore_best_weights=True,
    min_delta=0.001,
)
# ── Learning Rate Scheduler ──
lr_schedule = keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,
    decay_steps=10000,
)
optimizer = keras.optimizers.Adam(learning_rate=lr_schedule)
# ── Gradient Clipping ──
optimizer = keras.optimizers.Adam(
    learning_rate=1e-3,
    clipnorm=1.0,  # Clip by norm
    # clipvalue=0.5,  # Clip by value
)
# ── Debugging: Check for NaN ──
# Use gradient clipping, lower LR, check data normalization
# Check for exploding/vanishing gradients
# Use TensorBoard to monitor weights and gradients

| Symptom | Likely Cause | Solution |
|---|---|---|
| Training loss not decreasing | LR too high/low, data issue | Try LR=0.001, check labels, normalize data |
| Train loss low, val loss high | Overfitting | More dropout, L2 reg, data augmentation, reduce model |
| Both losses high | Underfitting | Increase model capacity, train longer, check data |
| NaN loss | Exploding gradients | Lower LR, gradient clipping, check for zeros in data |
| Loss oscillating | LR too high or batch too small | Reduce LR, increase batch size |
| Very slow training | Data pipeline bottleneck | Use tf.data prefetch, increase num_parallel_calls |
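A NaN loss can also be caught automatically rather than discovered in the logs: Keras ships a `TerminateOnNaN` callback, and `tf.debugging.check_numerics` raises as soon as a tensor contains NaN or Inf. A minimal sketch:

```python
import tensorflow as tf
from tensorflow import keras

# Stop training immediately if the loss becomes NaN or Inf.
nan_guard = keras.callbacks.TerminateOnNaN()
# model.fit(x, y, callbacks=[nan_guard, ...])

# Assert a tensor is finite; raises InvalidArgumentError otherwise.
t = tf.constant([1.0, 2.0])
checked = tf.debugging.check_numerics(t, message="activations")
print(checked.numpy())  # [1. 2.]
```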
Keras Tuner automates hyperparameter search using strategies like Random Search, Bayesian Optimization, and Hyperband.
import keras_tuner as kt
from tensorflow import keras
from tensorflow.keras import layers
def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Input(shape=(784,)))
    # Tune number of layers
    for i in range(hp.Int('num_layers', 1, 4)):
        model.add(layers.Dense(
            units=hp.Int(f'units_{i}', min_value=32, max_value=512, step=32),
            activation=hp.Choice('activation', ['relu', 'tanh']),
            kernel_regularizer=keras.regularizers.l2(
                hp.Float('l2', 1e-5, 1e-2, sampling='log')
            ),
        ))
    model.add(layers.Dropout(hp.Float('dropout', 0.1, 0.5, step=0.1)))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(
        optimizer=keras.optimizers.Adam(
            learning_rate=hp.Float('lr', 1e-4, 1e-2, sampling='log')
        ),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'],
    )
    return model
tuner = kt.Hyperband(
    build_model,
    objective='val_accuracy',
    max_epochs=30,
    factor=3,
    directory='tuner_results',
    project_name='mnist_tuning',
)
tuner.search(x_train, y_train, epochs=30,
             validation_split=0.2)
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
best_model = tuner.hypermodel.build(best_hps)

Common TensorFlow interview questions with detailed answers covering core concepts, training, deployment, and troubleshooting.