A small language model with surprisingly good performance.
It would be great if we could have a serverless endpoint for it. Please consider serving it at full precision.