Model Compression for Production AI: Making Large Models Fast Enough for Real-World Deployment