Generate Music Recommendations Utilizing LangChain Agents

Author: Murphy
Image from Unsplash by Marcela Laskoski

As we've explored in previous articles, Large Language Models (LLMs) contain a vast amount of knowledge and can answer questions across many domains using only the data they were trained on. In the past we've analyzed how techniques such as Retrieval Augmented Generation (RAG) can enhance responses by providing access to additional data that helps a model generate more accurate outputs. While RAG and Fine-Tuning can familiarize a model with a specific data/knowledge base, sometimes models need access to data sources that are subject to change.

An example of this is real-time data sources. For instance, if we asked a model what the weather is today, it would be unable to generate a proper response.

ChatGPT Response (Screenshot by author)

A large problem with LLMs in general is that they don't have access to external data sources. A model is only trained on data up to a certain point in time, so it might not have the latest and greatest information needed to provide a proper answer.

How can we augment our LLM with real-time data access? Agents are a Generative AI construct that take it a level above simple question answering or text generation. With Agents, we provide access to different tools or APIs and allow the model itself to reason about the right course of action to take. In the case of the weather example, we would give our Agent access to a Weather API and allow it to retrieve the necessary data to answer the above question.

In the theme of enabling data access, I thought of a problem I have at times. I'm an avid music listener, and sometimes I would just like to ask for some top song recommendations from artists that I have heard of. In the past, people have built Recommender Engines/Systems to solve personalization use-cases such as this. While this is still applicable, a solution that doesn't require building a dedicated model or undergoing any training is much better suited to my simple use-case.

In this example we take a look at how we can create an Agent using LangChain and integrate it with the Spotify API (via Spotipy) to generate music recommendations based on my queries. Why can't we use RAG? You could, if you have a large database of all songs already accumulated, but you would have to constantly update and maintain this data source. With Agents we have access to the latest data available via the Spotify API. In use-cases with real-time or any API-based data access, the Agent-driven route is ideal. Note that in certain use-cases you can even provide access to your RAG workflow via your Agent and let it determine if that's the right course of action to take. Now that we've established the problem, let's get to building out the solution.

NOTE: This article assumes an intermediate understanding of Python, LLMs, and LangChain. A great starter article for LangChain can be found here.

DISCLAIMER: I am a Machine Learning Architect at AWS and my opinions are my own.

Table of Contents

  1. Solution Overview
  2. Agent Driven Solution
     a. Setup
     b. Custom Tool Class
     c. Agent Creation & Invocation

  3. Additional Resources & Conclusion

1. Solution Overview

For our solution we'll be working with an Agent-driven approach. Traditional LLMs have a simple flow of thought: question and answer. With Agents, a more logical, human-like approach centered around reasoning is introduced. Specifically, we will be exploring a ReAct (Reasoning + Acting) Agent in our use-case. With a ReAct Agent, we allow for an observation and thought (Reasoning) before taking an action.
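To make this concrete, the intermediate steps of a ReAct Agent typically read something like the following illustrative trace for our eventual Spotify tool (a mock-up for intuition only, not actual model output):

Thought: The user wants music recommendations, so I should use the Spotify Music Recommender tool.
Action: Spotify Music Recommender
Action Input: {"artists": ["Future"], "tracks": 5}
Observation: ["track 1", "track 2", ...]
Thought: I now have enough tracks to answer the question.
Final Answer: Here are 5 recommended songs: ...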

How do we enable this Agent, and how is it different from traditional LLMs? We provide the Agent with Tools that give it access to different data sources, such as the Spotify API in this use-case. Within the Tools themselves we provide a natural language description of when they should be used. From these instructions, the LLM deduces the proper Tool to use depending on the user input.

High-Level Architecture (Made By Author)

Let's get an understanding of the stack that we will be using to implement this solution:

  • LangChain: LangChain is a popular Python framework that helps simplify Generative AI applications by providing ready-made modules that help with Prompt Engineering, RAG implementation, and LLM workflow orchestration. In this specific use-case we use LangChain to build out our ReAct Agent.
  • Agents: LangChain provides Agents of different types; in this case we will use a ReAct Agent and provide it with access to the necessary tools in our solution.
  • Tools: LangChain comes with a variety of built-in tools such as Wikipedia, Google Search, and more. In cases where there's no native API integration, you can build a custom Tool. For our project we build a custom Tool that integrates with the Spotify API and returns music recommendations.
  • Spotipy (Spotify API): Spotipy is an open-source Python library for the Spotify Web API that we will use to interact with the service. To get started you will need to create a Spotify project and get your credentials. Please refer to this guide for setup.
  • Bedrock Claude: We will utilize Claude via Amazon Bedrock as the LLM that is powering our Agent. Bedrock is also natively integrated with the LLM class within LangChain, making it easier to build out our Agent.

Now that we understand the different parts of our solution, let's look at how we can build it all out.

2. Agent Driven Solution

a. Setup

For this example, we will be working in a SageMaker Studio Notebook with a ml.t3.medium instance. You can work in the development environment of your choice as long as you can install the following libraries:

!pip install spotipy langchain

Note that before we get started you need to create a Spotify developer account if you do not have one. Within your Spotify developer account, ensure that you have created an app; this will expose the credentials that are needed to work with the API. Once the app has been created you should be able to view your credentials (Settings tab) and API requests within your project on the Dashboard.

Spotify Dashboard (Screenshot by Author)

Within our notebook we then instantiate our client that we will be using to work with Spotify.

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import random

client_id = 'Enter Client ID'
client_secret = 'Enter Client Secret'

# instantiate spotipy client using the client credentials flow
sp = spotipy.Spotify(
    client_credentials_manager=SpotifyClientCredentials(
        client_id=client_id,
        client_secret=client_secret
    )
)
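As a quick sanity check (my own optional addition, not part of the original walkthrough), you can issue a simple search to confirm the credentials are working; the artist name here is just a placeholder:

# hypothetical sanity check: search for an artist to verify the client is authenticated
results = sp.search(q='artist:Radiohead', type='artist', limit=1)
print(results['artists']['items'][0]['name'])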

Now that we have our Spotify client set up, we are ready to get to the Agent orchestration portion.

b. Custom Tool Class

As we discussed, LangChain Agents need access to Tools that enable them to work with external data sources. There are many built-in tools, such as Wikipedia, where you can simply import the package in LangChain like the following:

from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
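As a quick illustration (my own addition, not part of the original walkthrough), a built-in tool like this can be instantiated and queried directly; the query string is arbitrary:

wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
print(wikipedia.run("LangChain"))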

At the moment, there's no native integration with the Spotify API, so we need to inherit from the BaseTool class and build a Spotify Tool that we will then give to our Agent.

We define a Spotify Tool that extends the BaseTool class:

from langchain.tools import BaseTool, StructuredTool, tool
class SpotifyTool(BaseTool):
    name = "Spotify Music Recommender"
    description = "Use this tool when asked music recommendations."

Note that we provide a description of when to use this Tool; this allows the LLM to use natural language understanding to infer which Tool to use and when. We also provide a schema of the inputs that the tool should expect. In this case we specify two parameters:

  • Artists: The list of artists that we are interested in; based on this, the LLM will recommend more top tracks from those artists.
  • Number of Tracks: The number of suggested tracks that I'd like to be displayed.
from typing import Type
from langchain.pydantic_v1 import BaseModel, Field

# schema for the inputs the tool expects
class MusicInput(BaseModel):
    artists: list = Field(description="A list of artists that they'd like to see music from")
    tracks: int = Field(description="The number of tracks/songs they want returned.")

class SpotifyTool(BaseTool):
    name = "Spotify Music Recommender"
    description = "Use this tool when asked music recommendations."
    args_schema: Type[BaseModel] = MusicInput # define schema

Now that we've given the LLM an understanding of what inputs to look for in the prompt, we can define a few different methods for working with the Spotipy package:

# utils
  @staticmethod
  def retrieve_id(artist_name: str) -> str:
      # search for the artist and grab the ID of the top match
      results = sp.search(q='artist:' + artist_name, type='artist')
      if len(results['artists']['items']) > 0:
          artist_id = results['artists']['items'][0]['id']
      else:
          raise ValueError(f"No artists found with this name: {artist_name}")
      return artist_id

  @staticmethod
  def retrieve_tracks(artist_id: str, num_tracks: int) -> list:
      if num_tracks > 10:
          raise ValueError("Can only provide up to 10 tracks per artist")
      tracks = []
      top_tracks = sp.artist_top_tracks(artist_id)
      for track in top_tracks['tracks'][:num_tracks]:
          tracks.append(track['name'])
      return tracks

  @staticmethod
  def all_top_tracks(artist_array: list) -> list:
      complete_track_arr = []
      for artist in artist_array:
          artist_id = SpotifyTool.retrieve_id(artist)
          all_tracks = {artist: SpotifyTool.retrieve_tracks(artist_id, 10)}
          complete_track_arr.append(all_tracks)
      return complete_track_arr

These methods take the requested artists and return the top tracks of those artists. Note that the Spotify API's top-tracks endpoint only returns up to 10 tracks per artist.
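As a rough illustration of what these helpers return (my own example; it assumes the full class above has been assembled into a single definition, and the artist name is just a placeholder), you could call them directly outside of the Agent:

# hypothetical direct call: returns a list like [{'Future': ['track 1', ..., 'track 10']}]
sample_tracks = SpotifyTool.all_top_tracks(["Future"])
print(sample_tracks)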

We then define a main execution function where we gather all the top tracks of the requested artists and sample the number of tracks requested in our prompt:

# main execution
  def _run(self, artists: list, tracks: int) -> list:
      num_artists = len(artists)
      max_tracks = num_artists * 10 # only 10 top tracks available per artist
      if tracks > max_tracks:
          raise ValueError(f"Only 10 tracks per artist, max tracks for this many artists is: {max_tracks}")
      all_tracks_map = SpotifyTool.all_top_tracks(artists) # map of each artist to their top 10 tracks
      # flatten into one complete list of track names
      all_tracks = [track for artist_map in all_tracks_map for artist_tracks in artist_map.values() for track in artist_tracks]
      final_tracks = random.sample(all_tracks, tracks) # randomly sample the requested number of tracks
      return final_tracks

If you want to bake in extra functionality with the API (such as building your own playlist), you can define those extra methods within the tool itself, as sketched below.
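For example, here is a hedged sketch of such an extension that could live inside the SpotifyTool class (this is my own illustrative addition, not part of the original tool). Note that creating a playlist requires user-level OAuth rather than the client-credentials flow used above; the scope and the assumption that SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET, and SPOTIPY_REDIRECT_URI are set as environment variables are both part of this sketch:

from spotipy.oauth2 import SpotifyOAuth

  @staticmethod
  def create_playlist(track_names: list, playlist_name: str = "Agent Recommendations") -> str:
      # user-level auth is required to modify playlists (credentials read from SPOTIPY_* env vars)
      user_sp = spotipy.Spotify(auth_manager=SpotifyOAuth(scope="playlist-modify-private"))
      user_id = user_sp.current_user()["id"]
      playlist = user_sp.user_playlist_create(user_id, playlist_name, public=False)
      # resolve each track name to a URI and add it to the new playlist
      track_uris = []
      for name in track_names:
          result = user_sp.search(q=f"track:{name}", type="track", limit=1)
          items = result["tracks"]["items"]
          if items:
              track_uris.append(items[0]["uri"])
      user_sp.playlist_add_items(playlist["id"], track_uris)
      return playlist["external_urls"]["spotify"]

You could then let the Agent call this after generating recommendations, or invoke it yourself with the list of tracks the Agent returns.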

Now that we have defined our custom Tool class, we can focus on stitching it together with the other components to create an Agent.

c. Agent Creation & Invocation

Now that we've defined the input/output specs for our Agent's Tool, we must define the LLM that will be the brains of the operation.

In this case we use Anthropic Claude via Amazon Bedrock:

from langchain.llms import Bedrock
model_id = "anthropic.claude-v2:1"
model_params = {"max_tokens_to_sample": 500,
                "top_k": 100,
                "top_p": .95,
                "temperature": .5}
llm = Bedrock(
    model_id=model_id,
    model_kwargs=model_params
)

# sample Bedrock Inference
llm("What is the capitol of the United States?")

We can then instantiate our Tool class and combine it with the LLM to create our Agent. Note that we specify a structured-chat ReAct Agent type; adjust this as needed depending on the type of Agent you are using.

from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType

tools = [SpotifyTool()]
agent = initialize_agent(tools, llm,
                         agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
                         verbose=True)

Once our Agent has been constructed, we can run a sample inference (a random mix of artists I know) and see the Chain of Thought, since we've enabled verbosity.

print(agent.run("""I like the following artists: [Arijit Singh, Future, 
The Weekend], can I get 12 song recommendations with them in it."""))
Agent Response One (Screenshot by Author)
Agent Response Two (Screenshot by Author)

Notice that the Agent specifically looks for the two parameters that we specified for this Tool; once it has identified those parameters, it takes the logical course of action and correctly performs inference with the values we submitted.

Note that in this prompt we directly pass in an array, but you can make this cleaner by structuring your prompt with a LangChain Prompt Template.
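A minimal sketch of that approach (the template wording and variable names are my own) might look like this:

from langchain.prompts import PromptTemplate

recommendation_prompt = PromptTemplate.from_template(
    "I like the following artists: {artists}. "
    "Can I get {num_tracks} song recommendations with them in it?"
)
print(agent.run(recommendation_prompt.format(artists="Arijit Singh, Future", num_tracks=12)))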

3. Additional Resources & Conclusion

GenAI-Samples/LangChain-Agents-Spotify/langchain-agents-spotify.ipynb at master ·…

The code for the entire example can be found at the link above, along with my other Generative AI code samples. The power of Agents is that you dictate the backend functionality of what can be retrieved. In this sample we simply retrieve the top tracks for artists, but you can extend it to add features such as directly creating a playlist within your Spotify account by adding the appropriate API calls within your Tool. Note that for more real-world or personalized use-cases you can also utilize RAG to give your Agent access to your own data/music and let it derive suggestions from what is accessible there.

I hope this article was a useful introduction to how you can integrate APIs with your Generative AI Agents to solve challenging problems. Stay tuned for more content in this space!

As always thank you for reading and feel free to leave any feedback.


If you enjoyed this article feel free to connect with me on LinkedIn and subscribe to my Medium Newsletter.

Tags: AWS, Bedrock, Generative AI, Tools, LangChain, LLM
