Why OpenAI's API Is More Expensive for Non-English Languages
Beyond words: How byte pair encoding and Unicode encoding factor into pricing disparities- 22272Murphy ≡ DeepGuide
Byte-Pair Encoding For Beginners
An illustrative guide to BPE tokenizer in plain simple language
Structured Generative AI
How to constrain your model to output defined formats
The Art of Tokenization: Breaking Down Text for AI
Demystifying NLP: From Text to Embeddings
Under-trained and Unused tokens in Large Language Models
Existence of under-trained and unused tokens and Identification Techniques using GPT-2 Small as an Example
This Is How LLMs Break Down the Language
The science and art behind tokenization
LettuceDetect: A Hallucination Detection Framework for RAG Applications
How to capitalize on ModernBERT’s extended context window to build a token-level classifier for hallucination detection