How to Measure Drift in ML Embeddings
We evaluated five embedding drift detection methods
4 ways to encode categorical features with high cardinality
We explore 4 methods to encode categorical variables with high cardinality: target encoding, count encoding, feature hashing and embedding.
How to Evaluate Representations
From unsupervised to supervised metrics
Visualizing Stochastic Regularization for Entity Embeddings
A glimpse into how neural networks perceive categoricals and their hierarchies
Running a SOTA 7B Parameter Embedding Model on a Single GPU
In this post I will explain how to run a state-of-the-art 7B parameter LLM-based embedding model on just a single 24GB GPU. I will cover some theory and then show how to run it with the HuggingFace Transformers library in Python in just a few lines of code.
Multimodal AI Search for Business Applications
Enabling businesses to extract real value from their data
Text Embeddings: Comprehensive Guide
Evolution, visualisation, and applications of text embeddings
How to Create Powerful Embeddings from Your Data to Feed into Your AI
This article will show you different approaches you can take to create embeddings for your data
OpenAI vs Open-Source Multilingual Embedding Models
Choosing the model that works best for your data
How to Test Graph Quality to Improve Graph Machine Learning Performance
Testing the quality of your graphs is vital to ensure their performance in your machine learning system
Statistical Method scDEED Detects Dubious t-SNE and UMAP Embeddings and Optimizes Hyperparameters
scDEED assigns a reliability score to each 2D embedding to indicate how much the data point's mid-range neighbors change in the 2D space
How to Create Powerful AI Representations by Combining Multimodal Information
Learn how you can incorporate multimodal information into your machine-learning system
Combine Text Embeddings and Knowledge (Graph) Embeddings in RAG systems
In my previous articles, I wrote about using Knowledge Graphs in conjunction with RAGs and how Graph techniques can be used for Adaptive...
What Is a Latent Space?
A concise explanation for the general reader
Are GPTs Good Embedding Models
A surprising experiment to show that the devil is in the details
Voyage Multilingual 2 Embedding Evaluation
Compared to OpenAI, Cohere, Google, and E5
Embeddings Are Kind of Shallow
What I learned doing semantic search on U.S. Presidents with four language model embeddings
Working with Embeddings: Closed versus Open Source
Using techniques to improve semantic search
Dance between dense and sparse embeddings: Enabling Hybrid Search in LangChain-Milvus
How to create and search a multi-vector store in langchain-milvus. This blog post was co-authored by Omri Levy and Ohad Eytan, as part of the work we have done in IBM Research Israel. Recently, we at IBM Research needed to use hybrid se...