Use GPU runtime: `Runtime > Change runtime type > GPU`
Verify GPU availability: `!nvidia-smi`
Use mixed precision: `tf.keras.mixed_precision.set_global_policy("mixed_float16")`
Reduce input image size when possible
Use `tf.data` pipelines instead of Python generators
Enable dataset caching: `.cache()`
Enable dataset prefetching: `.prefetch(tf.data.AUTOTUNE)`
Use parallel mapping: `.map(…, num_parallel_calls=tf.data.AUTOTUNE)`
Use batch sizes that fit GPU memory
Keep images in efficient formats and avoid repeated decoding
Store datasets in Google Drive only if necessary
Copy data to local Colab storage (`/content`) for faster access
Use EfficientNetB3 only if accuracy gain justifies compute cost
Freeze the EfficientNetB3 base model during initial training
Train the classification head first
Unfreeze only the top layers for fine-tuning
Use a lower learning rate for fine-tuning
Use early stopping to avoid unnecessary epochs
Use model checkpointing to save only the best weights
Use learning rate scheduling
Use `include_top=False` for transfer learning
Use `pooling="avg"` to reduce parameter count
Avoid large fully connected layers after EfficientNetB3
Use dropout to reduce overfitting without increasing model size
Limit augmentation to efficient operations
Prefer built-in Keras augmentation layers over slow custom code
Use smaller validation sets when rapid iteration is needed
Monitor GPU memory usage during training
Clear unused variables and sessions: `tf.keras.backend.clear_session()`
Restart runtime when memory fragmentation becomes an issue
Save checkpoints to Google Drive only when needed
Use TensorBoard only if actively monitoring training
Disable unnecessary notebook outputs
Minimize cell reruns by structuring code cleanly
Use smaller `steps_per_epoch` for quick experiments
Profile bottlenecks before optimizing further
Reduce model input resolution if latency matters
Export the final model in a lightweight format
Use TensorFlow Lite for deployment efficiency
Use XLA compilation when beneficial: `tf.function(jit_compile=True)`
Keep preprocessing consistent between training and inference
Avoid excessive callbacks that add overhead
Keep the number of data-loading workers small if data loading is CPU-bound, since Colab runtimes typically expose only a few CPU cores
Fix random seeds for reproducible experiments, e.g. `tf.keras.utils.set_random_seed(42)`
Save only essential artifacts
Remove unused imports and variables
Use notebook cells sparingly and combine related operations
Prefer vectorized operations over loops
Use float32 labels and tensors only when required; integer labels and `uint8` images take less memory
Convert categorical labels once, not repeatedly
Use class weights only when class imbalance requires them
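Use `tf.keras.applications.efficientnet.preprocess_input` if applicable (in `tf.keras` it is a pass-through, because the EfficientNet models rescale inputs internally and expect pixel values in `[0, 255]`)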
Ensure images are resized once in the pipeline
Use `drop_remainder=True` for stable batch shapes when helpful
Avoid unnecessary model recompilation
Use `model.fit` with efficient dataset objects
Keep the number of trainable parameters low
Use transfer learning instead of training EfficientNetB3 from scratch
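
The transfer-learning tips above (frozen base, `include_top=False`, `pooling="avg"`, small head with dropout, top-layer fine-tuning, early stopping, best-weights checkpointing, learning rate scheduling) could be combined roughly as follows. This is only a sketch: the helper names `build_model` and `unfreeze_top`, the dropout rate, the number of unfrozen layers, the learning rates, and the checkpoint filename are all illustrative assumptions, not recommended values.

```python
import tensorflow as tf


def build_model(num_classes, img_size=300, weights="imagenet"):
    """EfficientNetB3 base (frozen) plus a small classification head."""
    base = tf.keras.applications.EfficientNetB3(
        include_top=False,
        weights=weights,
        input_shape=(img_size, img_size, 3),
        pooling="avg",  # global average pooling instead of a large Flatten + Dense
    )
    base.trainable = False  # phase 1: only the head is trained

    inputs = tf.keras.Input(shape=(img_size, img_size, 3))
    x = base(inputs, training=False)  # keep BatchNorm layers in inference mode
    x = tf.keras.layers.Dropout(0.3)(x)  # assumed rate
    # float32 output keeps the softmax numerically stable under mixed precision
    outputs = tf.keras.layers.Dense(
        num_classes, activation="softmax", dtype="float32"
    )(x)
    return base, tf.keras.Model(inputs, outputs)


def unfreeze_top(base, n_layers=20):
    """Phase 2: unfreeze only the last n_layers of the base, keeping BatchNorm frozen."""
    base.trainable = True
    for layer in base.layers[:-n_layers]:
        layer.trainable = False
    for layer in base.layers[-n_layers:]:
        if isinstance(layer, tf.keras.layers.BatchNormalization):
            layer.trainable = False


callbacks = [
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=3, restore_best_weights=True
    ),
    tf.keras.callbacks.ModelCheckpoint(
        "best.weights.h5",  # hypothetical path
        monitor="val_loss",
        save_best_only=True,
        save_weights_only=True,
    ),
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
]

# Usage sketch (train_ds and val_ds are assumed to be tf.data datasets):
# base, model = build_model(num_classes=5)
# model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
#               loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10, callbacks=callbacks)
# unfreeze_top(base)
# model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),  # lower LR for fine-tuning
#               loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5, callbacks=callbacks)
```

Changing `trainable` requires a recompile to take effect, so the sketch compiles exactly twice, once per phase, and avoids any other recompilation.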

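The input-pipeline tips above (`tf.data`, a single resize, parallel mapping, `.cache()`, `drop_remainder=True`, `.prefetch(tf.data.AUTOTUNE)`) might be wired together as in this minimal sketch. The names `preprocess` and `make_pipeline`, the batch size, the shuffle buffer, and the synthetic data are assumptions for illustration; a real notebook would read images from local storage under `/content`.

```python
import tensorflow as tf

IMG_SIZE = 300   # assumed target resolution (EfficientNetB3's default input size)
BATCH_SIZE = 32  # assumed; pick the largest size that fits GPU memory


def preprocess(image, label):
    """Resize exactly once; tf.image.resize returns float32 in the original value range."""
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    return image, label


def make_pipeline(dataset, training=True):
    """Parallel map, then cache, shuffle, batch, and prefetch."""
    ds = dataset.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.cache()  # preprocessed examples are reused after the first epoch
    if training:
        ds = ds.shuffle(1000)  # shuffle after cache so each epoch gets a new order
    ds = ds.batch(BATCH_SIZE, drop_remainder=training)  # stable batch shapes
    return ds.prefetch(tf.data.AUTOTUNE)  # overlap input work with training


# Synthetic stand-in data (hypothetical): 64 images of 64x64 pixels, 5 classes.
images = tf.random.uniform((64, 64, 64, 3), maxval=255.0)
labels = tf.random.uniform((64,), maxval=5, dtype=tf.int32)
train_ds = make_pipeline(tf.data.Dataset.from_tensor_slices((images, labels)))
```

`.cache()` with no argument holds the dataset in memory, so for datasets larger than the available RAM a file path can be passed instead, e.g. `.cache("/content/train.cache")` (hypothetical path).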