Identify idle endpoints in Amazon SageMaker

Amazon SageMaker is a machine learning (ML) platform designed to simplify the process of building, training, deploying, and managing ML models at scale. With a comprehensive suite of tools and services, SageMaker offers developers and data scientists the resources they need to accelerate the development and deployment of ML solutions.

In today’s fast-paced technological landscape, efficiency and agility are essential for businesses and developers striving to innovate. AWS plays a critical role in enabling this innovation by providing a range of services that abstract away the complexities of infrastructure management. By handling tasks such as provisioning, scaling, and managing resources, AWS allows developers to focus more on their core business logic and iterate quickly on new ideas.

As developers deploy and scale applications, unused resources such as idle SageMaker endpoints can accumulate unnoticed, leading to higher operational costs. This post addresses the issue of identifying and managing idle endpoints in SageMaker. We explore methods to monitor SageMaker endpoints effectively and distinguish between active and idle ones. Additionally, we walk through a Python script that automates the identification of idle endpoints using Amazon CloudWatch metrics.

Identify idle endpoints with a Python script

To effectively manage SageMaker endpoints and optimize resource utilization, we use a Python script that uses the AWS SDK for Python (Boto3) to interact with SageMaker and CloudWatch. This script automates the process of querying CloudWatch metrics to determine endpoint activity and identifies idle endpoints based on the number of invocations over a specified time period.

Let’s break down the key components of the Python script and explain how each part contributes to the identification of idle endpoints:

  • Global variables and AWS client initialization – The script begins by importing necessary modules and initializing global variables such as NAMESPACE, METRIC, LOOKBACK, and PERIOD. These variables define parameters for querying CloudWatch metrics and SageMaker endpoints. Additionally, AWS clients for interacting with SageMaker and CloudWatch services are initialized using Boto3.
from datetime import datetime, timedelta
import boto3

# AWS clients initialization
cloudwatch = boto3.client("cloudwatch")
sagemaker = boto3.client("sagemaker")

# Global variables
NAMESPACE = "AWS/SageMaker"
METRIC = "Invocations"
LOOKBACK = 1  # Number of days to look back for activity
PERIOD = 86400  # 1-day granularity reduces the volume of metrics retrieved while maintaining accuracy

# Calculate time range for querying CloudWatch metrics
ago = datetime.utcnow() - timedelta(days=LOOKBACK)
now = datetime.utcnow()
  • Identify idle endpoints – Based on the CloudWatch metrics data, the script determines whether an endpoint is idle or active. If an endpoint has received no invocations over the defined period, it’s flagged as idle. We select a cautious default threshold of zero invocations over the analyzed period; depending on your specific use case, you can raise this threshold to suit your requirements (a threshold-based sketch follows the helper functions below).
# Helper function to extract the endpoint (or inference component) name from a CloudWatch metric's dimensions

def get_endpoint_name_from_metric(metric):
    for d in metric["Dimensions"]:
        if d["Name"] in ("EndpointName", "InferenceComponentName"):
            yield d["Value"]

# Helper function to list all Invocations metrics in the AWS/SageMaker namespace

def list_metrics():
    paginator = cloudwatch.get_paginator("list_metrics")
    response_iterator = paginator.paginate(Namespace=NAMESPACE, MetricName=METRIC)
    return [m for r in response_iterator for m in r["Metrics"]]


# Helper function to check if an endpoint is in use: it aggregates the individual
# Invocations datapoints for a designated endpoint and checks whether the total is
# nonzero, which determines if the endpoint has been idle during the specified period

def is_endpoint_busy(metric):
    metric_values = cloudwatch.get_metric_data(
        MetricDataQueries=[{
            "Id": "metricname",
            "MetricStat": {
                "Metric": {
                    "Namespace": metric["Namespace"],
                    "MetricName": metric["MetricName"],
                    "Dimensions": metric["Dimensions"],
                },
                "Period": PERIOD,
                "Stat": "Sum",
                "Unit": "None",
            },
        }],
        StartTime=ago,
        EndTime=now,
        ScanBy="TimestampAscending",
        MaxDatapoints=24 * (LOOKBACK + 1),
    )
    return sum(metric_values.get("MetricDataResults", [{}])[0].get("Values", [])) > 0

# Helper function to log endpoint activity

def log_endpoint_activity(endpoint_name, is_busy):
    status = "BUSY" if is_busy else "IDLE"
    log_message = f"{datetime.utcnow()} - Endpoint {endpoint_name} {status}"
    print(log_message)
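
If zero invocations is too strict for your workload, only the final comparison needs to change. The following is a minimal sketch of a threshold-based variant; the INVOCATION_THRESHOLD name and value are our own illustration, not part of the original script, and the function reuses the clients and globals defined earlier:

# Hypothetical threshold variant; INVOCATION_THRESHOLD is an illustrative assumption
INVOCATION_THRESHOLD = 10  # endpoints with 10 or fewer invocations over the lookback window count as idle

def is_endpoint_busy_with_threshold(metric, threshold=INVOCATION_THRESHOLD):
    metric_values = cloudwatch.get_metric_data(
        MetricDataQueries=[{
            "Id": "metricname",
            "MetricStat": {
                "Metric": {
                    "Namespace": metric["Namespace"],
                    "MetricName": metric["MetricName"],
                    "Dimensions": metric["Dimensions"],
                },
                "Period": PERIOD,
                "Stat": "Sum",
                "Unit": "None",
            },
        }],
        StartTime=ago,
        EndTime=now,
        ScanBy="TimestampAscending",
        MaxDatapoints=24 * (LOOKBACK + 1),
    )
    total_invocations = sum(metric_values.get("MetricDataResults", [{}])[0].get("Values", []))
    return total_invocations > threshold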
  • Main function – The main() function serves as the entry point to run the script. It orchestrates the process of retrieving SageMaker endpoints, querying CloudWatch metrics, and logging endpoint activity.
# Main function to identify idle endpoints and log their activity status
def main():
    endpoints = sagemaker.list_endpoints()["Endpoints"]

    if not endpoints:
        print("No endpoints found")
        return

    existing_endpoint_names = [endpoint["EndpointName"] for endpoint in endpoints]

    for metric in list_metrics():
        for endpoint_name in get_endpoint_name_from_metric(metric):
            if endpoint_name in existing_endpoint_names:
                is_busy = is_endpoint_busy(metric)
                log_endpoint_activity(endpoint_name, is_busy)
            else:
                print(f"Endpoint {endpoint_name} not active")

if __name__ == "__main__":
    main()

By following along with the explanation of the script, you’ll gain a deeper understanding of how to automate the identification of idle endpoints in SageMaker, paving the way for more efficient resource management and cost optimization.

Permissions required to run the script

Before you run the provided Python script to identify idle endpoints in SageMaker, make sure your AWS Identity and Access Management (IAM) user or role has the necessary permissions. The permissions required for the script include:

  • CloudWatch permissions – The IAM entity running the script must have permissions for the CloudWatch actions cloudwatch:GetMetricData and cloudwatch:ListMetrics
  • SageMaker permissions – The IAM entity must have permissions to list SageMaker endpoints using the sagemaker:ListEndpoints action (a minimal policy sketch follows this list)
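
As a reference, the following is a minimal sketch of such a policy attached to a role with Boto3. The role name, policy name, and statement IDs are illustrative assumptions, and the wildcard resource is a simplification you may want to tighten:

import json
import boto3

# Minimal policy sketch; the Sid values, names, and wildcard resource are illustrative assumptions
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadCloudWatchMetrics",
            "Effect": "Allow",
            "Action": ["cloudwatch:GetMetricData", "cloudwatch:ListMetrics"],
            "Resource": "*",
        },
        {
            "Sid": "ListSageMakerEndpoints",
            "Effect": "Allow",
            "Action": "sagemaker:ListEndpoints",
            "Resource": "*",
        },
    ],
}

iam = boto3.client("iam")
iam.put_role_policy(
    RoleName="idle-endpoint-audit-role",  # hypothetical role name
    PolicyName="IdleEndpointAudit",
    PolicyDocument=json.dumps(policy_document),
)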

Run the Python script

You can run the Python script using various methods, including:

  • The AWS CLI – Make sure the AWS Command Line Interface (AWS CLI) is installed and configured with the appropriate credentials.
  • AWS Cloud9 – If you prefer a cloud-based integrated development environment (IDE), AWS Cloud9 provides an IDE with preconfigured settings for AWS development. Simply create a new environment, clone the script repository, and run the script within the Cloud9 environment.

In this post, we demonstrate running the Python script through the AWS CLI.

Actions to take after identifying idle endpoints

After you’ve successfully identified idle endpoints in your SageMaker environment using the Python script, you can take proactive steps to optimize resource utilization and reduce operational costs. The following are some actionable measures you can implement:

  • Delete or scale down endpoints – For endpoints that consistently show no activity over an extended period, consider deleting or scaling them down to minimize resource wastage. SageMaker allows you to delete idle endpoints through the AWS Management Console or programmatically using the AWS SDK (see the sketch after this list).
  • Review and refine the model deployment strategy – Evaluate the deployment strategy for your ML models and assess whether all deployed endpoints are necessary. Sometimes, endpoints may become idle due to changes in business requirements or model updates. By reviewing your deployment strategy, you can identify opportunities to consolidate or optimize endpoints for better efficiency.
  • Implement auto scaling policies – Configure auto scaling policies for active endpoints to dynamically adjust the compute capacity based on workload demand. SageMaker supports auto scaling, allowing you to automatically increase or decrease the number of instances serving predictions based on predefined metrics such as CPU utilization or inference latency.
  • Explore serverless inference options – Consider using SageMaker serverless inference as an alternative to traditional endpoint provisioning. Serverless inference eliminates the need for manual endpoint management by automatically scaling compute resources based on incoming prediction requests. This can significantly reduce idle capacity and optimize costs for intermittent or unpredictable workloads.
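
To make the first and third items concrete, the following is a minimal Boto3 sketch. The endpoint and variant names are illustrative assumptions (take the endpoint name from the script’s IDLE output), the two options are alternatives rather than steps to run together, and the target value should be tuned to your traffic:

import boto3

sagemaker = boto3.client("sagemaker")
autoscaling = boto3.client("application-autoscaling")

ENDPOINT_NAME = "my-idle-endpoint"  # hypothetical endpoint flagged as IDLE by the script
VARIANT_NAME = "AllTraffic"  # assumption: the default production variant name

# Option 1 – delete an endpoint that is consistently idle
sagemaker.delete_endpoint(EndpointName=ENDPOINT_NAME)

# Option 2 – keep the endpoint but scale it with demand instead
resource_id = f"endpoint/{ENDPOINT_NAME}/variant/{VARIANT_NAME}"
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)
autoscaling.put_scaling_policy(
    PolicyName="InvocationsPerInstanceTarget",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,  # invocations per instance; tune for your workload
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)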

Conclusion

In this post, we discussed the importance of identifying idle endpoints in SageMaker and provided a Python script to help automate this process. By implementing proactive monitoring solutions and optimizing resource utilization, SageMaker users can effectively manage their endpoints, reduce operational costs, and maximize the efficiency of their machine learning workflows.

Get started with the techniques demonstrated in this post to automate cost monitoring for SageMaker inference. Explore AWS re:Post for valuable resources on optimizing your cloud infrastructure and maximizing AWS services.

About the authors

Pablo Colazurdo is a Principal Solutions Architect at AWS where he enjoys helping customers to launch successful projects in the Cloud. He has many years of experience working on varied technologies and is passionate about learning new things. Pablo grew up in Argentina but now enjoys the rain in Ireland while listening to music, reading or playing D&D with his kids.

Ozgur Canibeyaz is a Senior Technical Account Manager at AWS with 8 years of experience. Ozgur helps customers optimize their AWS usage by navigating technical challenges, exploring cost-saving opportunities, achieving operational excellence, and building innovative services using AWS products.

Read More

Indian language RAG with Cohere multilingual embeddings and Anthropic Claude 3 on Amazon Bedrock

Media and entertainment companies serve multilingual audiences with a wide range of content catering to diverse audience segments. These enterprises have access to massive amounts of data collected over their many years of operations. Much of this data is unstructured text and images. Conventional approaches to analyzing unstructured data for generating new content rely on the use of keyword or synonym matching. These approaches don’t capture the full semantic context of a document, making them less effective for users’ search, content creation, and several other downstream tasks.

Text embeddings use machine learning (ML) capabilities to capture the essence of unstructured data. These embeddings are generated by language models that map natural language text into their numerical representations and, in the process, encode contextual information in the natural language document. Generating text embeddings is the first step to many natural language processing (NLP) applications powered by large language models (LLMs) such as Retrieval Augmented Generation (RAG), text generation, entity extraction, and several other downstream business processes.

Converting text to embeddings using the Cohere Embed – Multilingual model

Despite the rising popularity and capabilities of LLMs, the language most often used to converse with the LLM, often through a chat-like interface, is English. And although progress has been made in adapting open source models to comprehend and respond in Indian languages, such efforts fall short of the English language capabilities displayed among larger, state-of-the-art LLMs. This makes it difficult to adopt such models for RAG applications based on Indian languages.

In this post, we showcase a RAG application that can search and query across multiple Indian languages using the Cohere Embed – Multilingual model and Anthropic Claude 3 on Amazon Bedrock. This post focuses on Indian languages, but you can use the approach with other languages that are supported by the LLM.

Solution overview

We use the Flores dataset [1], a benchmark dataset for machine translation between English and low-resource languages. This also serves as a parallel corpus, which is a collection of texts that have been translated into one or more languages.

With the Flores dataset, we can demonstrate that the embeddings and, subsequently, the documents retrieved from the retriever, are relevant for the same question being asked in multiple languages. However, given the sparsity of the dataset (approximately 1,000 lines per language from more than 200 languages), the nature and number of questions that can be asked against the dataset is limited.

After you have downloaded the data, load the data into the pandas data frame for processing. For this demo, we are restricting ourselves to Bengali, Kannada, Malayalam, Tamil, Telugu, Hindi, Marathi, and English. If you are looking to adopt this approach for other languages, make sure the language is supported by both the embedding model and the LLM that’s being used in the RAG setup.

Load the data with the following code:

import pandas as pd

df_ben = pd.read_csv('./data/Flores/dev/dev.ben_Beng', sep='\t')
df_kan = pd.read_csv('./data/Flores/dev/dev.kan_Knda', sep='\t')
df_mal = pd.read_csv('./data/Flores/dev/dev.mal_Mlym', sep='\t')
df_tam = pd.read_csv('./data/Flores/dev/dev.tam_Taml', sep='\t')
df_tel = pd.read_csv('./data/Flores/dev/dev.tel_Telu', sep='\t')
df_hin = pd.read_csv('./data/Flores/dev/dev.hin_Deva', sep='\t')
df_mar = pd.read_csv('./data/Flores/dev/dev.mar_Deva', sep='\t')
df_eng = pd.read_csv('./data/Flores/dev/dev.eng_Latn', sep='\t')
# Choose fewer/more languages if needed

df_all_Langs = pd.concat([df_ben, df_kan, df_mal, df_tam, df_tel, df_hin, df_mar,df_eng], axis=1)
df_all_Langs.columns = ['Bengali', 'Kannada', 'Malayalam', 'Tamil', 'Telugu', 'Hindi', 'Marathi','English']

df_all_Langs.shape #(996,8)


df = df_all_Langs
stacked_df = df.stack().reset_index() # for ease of handling

# select only the required columns, rename them
stacked_df = stacked_df.iloc[:,[1,2]]
stacked_df.columns = ['language','text'] 

The Cohere multilingual embedding model

Cohere is a leading enterprise artificial intelligence (AI) platform that builds world-class LLMs and LLM-powered solutions that allow computers to search, capture meaning, and converse in text. They provide ease of use and strong security and privacy controls.

The Cohere Embed – Multilingual model generates vector representations of documents for over 100 languages and is available on Amazon Bedrock. With Amazon Bedrock, you can access the embedding model through an API call, which eliminates the need to manage the underlying infrastructure and makes sure sensitive information remains securely managed and protected.

The multilingual embedding model groups text with similar meanings by assigning them positions in the semantic vector space that are close to each other. Developers can process text in multiple languages without switching between different models. This makes processing more efficient and improves performance for multilingual applications.
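
You can observe this grouping directly by embedding the same sentence in two languages and comparing the vectors. The following is a minimal sketch using the same Bedrock invoke_model pattern as the rest of this post; the sample sentences and the dot-product comparison are our own illustration:

import json
import boto3
import numpy as np

bedrock_runtime = boto3.client("bedrock-runtime")

def embed(texts, input_type="search_document"):
    body = json.dumps({"texts": texts, "input_type": input_type})
    response = bedrock_runtime.invoke_model(
        body=body,
        modelId="cohere.embed-multilingual-v3",
        accept="*/*",
        contentType="application/json",
    )
    return json.loads(response.get("body").read())["embeddings"]

# Hypothetical parallel sentences in English and Hindi
vectors = np.array(embed([
    "The Indus Valley Civilization was a Bronze Age civilization.",
    "सिंधु घाटी सभ्यता कांस्य युग की सभ्यता थी।",
]))

# The search index later in this post uses inner-product space, so we compare with a
# dot product; parallel sentences should score markedly higher than unrelated ones
print(float(vectors[0] @ vectors[1]))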

Text embeddings turn unstructured data into a structured form. This allows you to objectively compare, dissect, and derive insights from all these documents. Cohere’s v3 embedding models have a required input parameter, input_type, which must be set for every API call and be one of the following four values, aligned to the most frequent use cases for text embeddings:

  • input_type="search_document" – Use this for texts (documents) you want to store in your vector database
  • input_type="search_query" – Use this for search queries to find the most relevant documents in your vector database
  • input_type="classification" – Use this if you use the embeddings as input for a classification system
  • input_type="clustering" – Use this if you use the embeddings for text clustering

Using these input types provides the highest possible quality for the respective tasks. If you want to use the embeddings for multiple use cases, we recommend using input_type="search_document".

Prerequisites

To use the Anthropic Claude 3 Sonnet LLM and the Cohere multilingual embeddings model on this dataset, make sure you have access to both models in your AWS account (on the Amazon Bedrock console, choose Model access), then install the following packages. The following code has been tested to work with the Amazon SageMaker Data Science 3.0 image, backed by an ml.t3.medium instance.

! apt-get update 
! apt-get install build-essential -y # for the hnswlib package below
! pip install hnswlib

Create a search index

With all of the prerequisites in place, you can now convert the multilingual corpus into embeddings and store them in hnswlib, a header-only C++ implementation of Hierarchical Navigable Small Worlds (HNSW) with Python bindings that supports insertions and updates. HNSWLib is an in-memory vector store that can be saved to a file, which is sufficient for the small dataset we’re working with. Use the following code:

import hnswlib
import os
import json
import botocore
import boto3

boto3_bedrock = boto3.client('bedrock')
bedrock_runtime = boto3.client('bedrock-runtime')

# Create a search index
index = hnswlib.Index(space='ip', dim=1024)
index.init_index(max_elements=10000, ef_construction=512, M=64)

all_text = stacked_df['text'].to_list()
all_text_lang = stacked_df['language'].to_list()

Embed and index documents

To embed and store the small multilingual dataset, use the Cohere embed-multilingual-v3.0 model, which creates embeddings with 1,024 dimensions, using the Amazon Bedrock runtime API:

modelId="cohere.embed-multilingual-v3"
contentType= "application/json"
accept = "*/*"


df_chunk_size = 80
chunk_embeddings = []
for i in range(0,len(all_text), df_chunk_size):
    chunk = all_text[i:i+df_chunk_size]
    body=json.dumps(
            {"texts":chunk,"input_type":"search_document"} # search documents
    ) 
    response = bedrock_runtime.invoke_model(body=body, 
                                            modelId=modelId,
                                            accept=accept,
                                            contentType=contentType)
    response_body = json.loads(response.get('body').read())
    index.add_items(response_body['embeddings'])
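
HNSWLib holds the index in memory; once it’s populated, you can persist it to a file and reload it later without re-embedding the corpus. A minimal sketch, with a hypothetical file path:

# Persist the populated index to disk (the path is an illustrative assumption)
index.save_index("./flores_multilingual.idx")

# Later, reload it instead of re-embedding the corpus
restored_index = hnswlib.Index(space='ip', dim=1024)
restored_index.load_index("./flores_multilingual.idx", max_elements=10000)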

Verify that the embeddings work

To test the solution, write a function that takes a query as input, embeds it, and finds the top N documents most closely related to it:

# Retrieval of closest N docs to query
def retrieval(query, num_docs_to_return=10):
    modelId="cohere.embed-multilingual-v3"
    contentType= "application/json"
    accept = "*/*"
    body=json.dumps(
            {"texts":[query],"input_type":"search_query"} # search query
    ) 
    response = bedrock_runtime.invoke_model(body=body, 
                                            modelId=modelId,
                                            accept=accept,
                                            contentType=contentType)
    response_body = json.loads(response.get('body').read())
    doc_ids = index.knn_query(response_body['embeddings'], 
                              k=num_docs_to_return)[0][0] 
    print(f"Query: {query} n")
    retrieved_docs = []

    for doc_id in doc_ids:
        # Append results
        retrieved_docs.append(all_text[doc_id]) # original vernacular language docs

        # Print results
        print(f"Original Flores Text {all_text[doc_id]}")
        print("-"*30)

    print("END OF RESULTS nn")
    return retrieved_docs   

You can explore what the RAG stack does with a couple of queries in different languages, such as Hindi:

queries = [
    "मुझे सिंधु नदी घाटी सभ्यता के बारे में बताइए",
]
# translation: tell me about Indus Valley Civilization
for query in queries:
    retrieval(query)

The index returns documents relevant to the search query from across languages:

Query: मुझे सिंधु नदी घाटी सभ्यता के बारे में बताइए 

Original Flores Text सिंधु घाटी सभ्यता उत्तर-पश्चिम भारतीय उपमहाद्वीप में कांस्य युग की सभ्यता थी जिसमें आस-पास के आधुनिक पाकिस्तान और उत्तर पश्चिम भारत और उत्तर-पूर्व अफ़गानिस्तान के कुछ क्षेत्र शामिल थे.
------------------------------
Original Flores Text सिंधु नदी के घाटों में पनपी सभ्यता के कारण यह इसके नाम पर बनी है.
------------------------------
Original Flores Text यद्यपि कुछ विद्वानों का अनुमान है कि चूंकि सभ्यता अब सूख चुकी सरस्वती नदी के घाटियों में विद्यमान थी, इसलिए इसे सिंधु-सरस्वती सभ्यता कहा जाना चाहिए, जबकि 1920 के दशक में हड़प्पा की पहली खुदाई के बाद से कुछ इसे हड़प्पा सभ्यता कहते हैं।
------------------------------
Original Flores Text సింధు నది పరీవాహక ప్రాంతాల్లో నాగరికత విలసిల్లింది.
------------------------------
Original Flores Text सिंधू संस्कृती ही वायव्य भारतीय उपखंडातील कांस्य युग संस्कृती होती ज्यामध्ये  आधुनिक काळातील पाकिस्तान, वायव्य भारत आणि ईशान्य अफगाणिस्तानातील काही प्रदेशांचा समावेश होता.
------------------------------
Original Flores Text সিন্ধু সভ্যতা হল উত্তর-পশ্চিম ভারতীয় উপমহাদেশের একটি তাম্রযুগের সভ্যতা যা আধুনিক-পাকিস্তানের অধিকাংশ ও উত্তর-পশ্চিম ভারত এবং উত্তর-পূর্ব আফগানিস্তানের কিছু অঞ্চলকে ঘিরে রয়েছে।
-------------------------
 .....

You can now use these documents retrieved from the index as context while calling the Anthropic Claude 3 Sonnet model on Amazon Bedrock. In production settings with datasets that are several orders of magnitude larger than the Flores dataset, we can make the search results from the index even more relevant by using Cohere’s Rerank models.

Use the system prompt to outline how you want the LLM to process your query:

# Retrieval of docs relevant to the query
def context_retrieval(query, num_docs_to_return=10):

    modelId="cohere.embed-multilingual-v3"
    contentType= "application/json"
    accept = "*/*"
    body=json.dumps(
            {"texts":[query],"input_type":"search_query"} # search query
    ) 
    response = bedrock_runtime.invoke_model(body=body, 
                                            modelId=modelId,
                                            accept=accept,
                                            contentType=contentType)
    response_body = json.loads(response.get('body').read())
    doc_ids = index.knn_query(response_body['embeddings'], 
                              k=num_docs_to_return)[0][0] 
    retrieved_docs = []
    
    for doc_id in doc_ids:
        retrieved_docs.append(all_text[doc_id])
    return " ".join(retrieved_docs)

def query_rag_bedrock(query, model_id = 'anthropic.claude-3-sonnet-20240229-v1:0'):

    system_prompt = '''
    You are a helpful, empathetic, multilingual assistant.
    Identify the language of the user query, and respond to the user query in the same language.

    For example:
    if the user query is in English, your response will be in English,
    if the user query is in Malayalam, your response will be in Malayalam,
    if the user query is in Tamil, your response will be in Tamil,
    and so on...

    If you cannot identify the language, say that you cannot identify the language.

    Use only the data provided within the <context> </context> tags that matches the user query's language to answer the user's query.
    If there is no data provided within the <context> </context> tags, say that you do not have enough information to answer the question.

    Restrict your response to a single paragraph of less than 400 words; avoid bullet points.
    '''
    max_tokens = 1000

    messages  = [{"role": "user", "content": f'''
                    query : {query}
                    <context>
                    {context_retrieval(query)}
                    </context>
                '''}]

    body=json.dumps(
            {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": max_tokens,
                "system": system_prompt,
                "messages": messages
            }  
        )  


    response = bedrock_runtime.invoke_model(body=body, modelId=model_id)
    response_body = json.loads(response.get('body').read())
    return response_body['content'][0]['text']

Let’s pass in the same query in multiple Indian languages:

queries = ["tell me about the indus river valley civilization",
           "मुझे सिंधु नदी घाटी सभ्यता के बारे में बताइए",
           "मला सिंधू नदीच्या संस्कृतीबद्दल सांगा",
           "సింధు నది నాగరికత గురించి చెప్పండి",
           "ಸಿಂಧೂ ನದಿ ಕಣಿವೆ ನಾಗರಿಕತೆಯ ಬಗ್ಗೆ ಹೇಳಿ", 
           "সিন্ধু নদী উপত্যকা সভ্যতা সম্পর্কে বলুন",
           "சிந்து நதி பள்ளத்தாக்கு நாகரிகத்தைப் பற்றி சொல்",
           "സിന്ധു നദീതാഴ്വര നാഗരികതയെക്കുറിച്ച് പറയുക"] 

for query in queries:
    print(query_rag_bedrock(query))
    print('_'*20)

The query is in English, so I will respond in English.

The Indus Valley Civilization, also known as the Harappan Civilization, was a Bronze Age civilization that flourished in the northwestern regions of the Indian subcontinent, primarily in the basins of the Indus River and its tributaries. It encompassed parts of modern-day Pakistan, northwest India, and northeast Afghanistan. While some scholars suggest calling it the Indus-Sarasvati Civilization due to its presence in the now-dried-up Sarasvati River basin, the name "Indus Valley Civilization" is derived from its development along the Indus River valley. This ancient civilization dates back to around 3300–1300 BCE and was one of the earliest urban civilizations in the world. It was known for its well-planned cities, advanced drainage systems, and a writing system that has not yet been deciphered.
____________________
सिंधु घाटी सभ्यता एक प्राचीन नगर सभ्यता थी जो उत्तर-पश्चिम भारतीय उपमहाद्वीप में फैली हुई थी। यह लगभग 3300 से 1300 ईसा पूर्व की अवधि तक विकसित रही। इस सभ्यता के केंद्र वर्तमान पाकिस्तान के सिंध और पंजाब प्रांतों में स्थित थे, लेकिन इसके अवशेष भारत के राजस्थान, गुजरात, मध्य प्रदेश, महाराष्ट्र और उत्तर प्रदेश में भी मिले हैं। सभ्यता का नाम सिंधु नदी से लिया गया है क्योंकि इसके प्रमुख स्थल इस नदी के किनारे स्थित थे। हालांकि, कुछ विद्वानों का अनुमान है कि सरस्वती नदी के किनारे भी इस सभ्यता के स्थल विद्यमान थे इसलिए इसे सिंधु-सरस्वती सभ्यता भी कहा जाता है। यह एक महत्वपूर्ण शहरी समाज था जिसमें विकसित योजना बनाने की क्षमता, नगरीय संरचना और स्वच्छ जलापूर्ति आदि प्रमुख विशेषताएं थीं।
____________________
सिंधू संस्कृती म्हणजे सिंधू नदीच्या पट्टीकेतील प्राचीन संस्कृती होती. ही संस्कृती सुमारे ई.पू. ३३०० ते ई.पू. १३०० या कालखंडात फुलणारी होती. ती भारतातील कांस्ययुगीन संस्कृतींपैकी एक मोठी होती. या संस्कृतीचे अवशेष आजच्या पाकिस्तान, भारत आणि अफगाणिस्तानमध्ये आढळून आले आहेत. या संस्कृतीत नगररचना, नागरी सोयी सुविधांचा विकास झाला होता. जलवाहिनी, नगरदेवालय इत्यादी अद्भुत बाबी या संस्कृतीत होत्या. सिंधू संस्कृतीत लिपीसुद्धा विकसित झाली होती परंतु ती अजूनही वाचण्यास आलेली नाही. सिंधू संस्कृती ही भारतातील पहिली शहरी संस्कृती मानली जाते.
____________________
సింధు నది నాగరికత గురించి చెప్పుతూ, ఈ నాగరికత సింధు నది పరిసర ప్రాంతాల్లో ఉన్నదని చెప్పవచ్చు. దీనిని సింధు-సరస్వతి నాగరికత అనీ, హరప్ప నాగరికత అనీ కూడా పిలుస్తారు. ఇది ఉత్తర-ఆర్య భారతదేశం, ఆధునిక పాకిస్తాన్, ఉత్తర-పశ్చిమ భారతదేశం మరియు ఉత్తర-ఆర్థిక అఫ్గానిస్తాన్ కు చెందిన తామ్రయుగపు నాగరికత. సరస్వతి నది పరీవాహక ప్రాంతాల్లోనూ నాగరికత ఉందని కొందరు పండితులు అభిప్రాయపడ్డారు. దీని మొదటి స్థలాన్ని 1920లలో హరప్పాలో త్రవ్వారు. ఈ నాగరికతలో ప్రశస్తమైన బస్తీలు, నగరాలు, మలిచ్చి రంగులతో నిర్మించిన భవనాలు, పట్టణ నిర్మాణాలు ఉన్నాయి.
____________________
ಸಿಂಧೂ ಕಣಿವೆ ನಾಗರಿಕತೆಯು ವಾಯುವ್ಯ ಭಾರತದ ಉಪಖಂಡದಲ್ಲಿ ಕಂಚಿನ ಯುಗದ ನಾಗರಿಕತೆಯಾಗಿದ್ದು, ಪ್ರಾಚೀನ ಭಾರತದ ಇತಿಹಾಸದಲ್ಲಿ ಮುಖ್ಯವಾದ ಪಾತ್ರವನ್ನು ವಹಿಸಿದೆ. ಈ ನಾಗರಿಕತೆಯು ಆಧುನಿಕ-ದಿನದ ಪಾಕಿಸ್ತಾನ ಮತ್ತು ವಾಯುವ್ಯ ಭಾರತದ ಭೂಪ್ರದೇಶಗಳನ್ನು ಹಾಗೂ ಈಶಾನ್ಯ ಅಫ್ಘಾನಿಸ್ತಾನದ ಕೆಲವು ಪ್ರದೇಶಗಳನ್ನು ಒಳಗೊಂಡಿರುವುದರಿಂದ ಅದಕ್ಕೆ ಸಿಂಧೂ ನಾಗರಿಕತೆ ಎಂದು ಹೆಸರಿಸಲಾಗಿದೆ. ಸಿಂಧೂ ನದಿಯ ಪ್ರದೇಶಗಳಲ್ಲಿ ಈ ನಾಗರಿಕತೆಯು ವಿಕಸಿತಗೊಂಡಿದ್ದರಿಂದ ಅದಕ್ಕೆ ಸಿಂಧೂ ನಾಗರಿಕತೆ ಎಂದು ಹೆಸರಿಸಲಾಗಿದೆ. ಈಗ ಬತ್ತಿ ಹೋದ ಸರಸ್ವತಿ ನದಿಯ ಪ್ರದೇಶಗಳಲ್ಲಿ ಸಹ ನಾಗರೀಕತೆಯ ಅಸ್ತಿತ್ವವಿದ್ದಿರಬಹುದೆಂದು ಕೆಲವು ಪ್ರಾಜ್ಞರು ಶಂಕಿಸುತ್ತಾರೆ. ಆದ್ದರಿಂದ ಈ ನಾಗರಿಕತೆಯನ್ನು ಸಿಂಧೂ-ಸರಸ್ವತಿ ನಾಗರಿಕತೆ ಎಂದು ಸೂಕ್ತವಾಗಿ ಕರೆ
____________________
সিন্ধু নদী উপত্যকা সভ্যতা ছিল একটি প্রাচীন তাম্রযুগীয় সভ্যতা যা বর্তমান পাকিস্তান এবং উত্তর-পশ্চিম ভারত ও উত্তর-পূর্ব আফগানিস্তানের কিছু অঞ্চলকে নিয়ে গঠিত ছিল। এই সভ্যতার নাম সিন্ধু নদীর অববাহিকা অঞ্চলে এটির বিকাশের কারণে এরকম দেওয়া হয়েছে। কিছু পণ্ডিত মনে করেন যে সরস্বতী নদীর ভূমি-প্রদেশেও এই সভ্যতা বিদ্যমান ছিল, তাই এটিকে সিন্ধু-সরস্বতী সভ্যতা বলা উচিত। আবার কেউ কেউ এই সভ্যতাকে হরপ্পা পরবর্তী হরপ্পান সভ্যতা নামেও অবিহিত করেন। যাই হোক, সিন্ধু সভ্যতা ছিল প্রাচীন তাম্রযুগের এক উল্লেখযোগ্য সভ্যতা যা সিন্ধু নদী উপত্যকার এলাকায় বিকশিত হয়েছিল।
____________________
சிந்து நதிப் பள்ளத்தாக்கில் தோன்றிய நாகரிகம் சிந்து நாகரிகம் என்றழைக்கப்படுகிறது. சிந்து நதியின் படுகைகளில் இந்த நாகரிகம் மலர்ந்ததால் இப்பெயர் வழங்கப்பட்டது. ஆனால், தற்போது வறண்டுபோன சரஸ்வதி நதிப் பகுதியிலும் இந்நாகரிகம் இருந்திருக்கலாம் என சில அறிஞர்கள் கருதுவதால், சிந்து சரஸ்வதி நாகரிகம் என்று அழைக்கப்பட வேண்டும் என்று வாதிடுகின்றனர். மேலும், இந்நாகரிகத்தின் முதல் தளமான ஹரப்பாவின் பெயரால் ஹரப்பா நாகரிகம் என்றும் அழைக்கப்படுகிறது. இந்த நாகரிகம் வெண்கலயுக நாகரிகமாக கருதப்படுகிறது. இது தற்கால பாகிஸ்தானின் பெரும்பகுதி, வடமேற்கு இந்தியா மற்றும் வடகிழக்கு ஆப்கானிஸ்தானின் சில பகுதிகளை உள்ளடக்கியது.
____________________
സിന്ധു നദീതട സംസ്കാരം അഥവാ ഹാരപ്പൻ സംസ്കാരം ആധുനിക പാകിസ്ഥാൻ, വടക്ക് പടിഞ്ഞാറൻ ഇന്ത്യ, വടക്ക് കിഴക്കൻ അഫ്ഗാനിസ്ഥാൻ എന്നിവിടങ്ങളിൽ നിലനിന്ന ഒരു വെങ്കല യുഗ സംസ്കാരമായിരുന്നു. ഈ സംസ്കാരത്തിന്റെ അടിസ്ഥാനം സിന്ധു നദിയുടെ തടങ്ങളായതിനാലാണ് ഇതിന് സിന്ധു നദീതട സംസ്കാരം എന്ന പേര് ലഭിച്ചത്. ചില പണ്ഡിതർ ഇപ്പോൾ വറ്റിപ്പോയ സരസ്വതി നദിയുടെ തടങ്ങളിലും ഈ സംസ്കാരം നിലനിന്നിരുന്നതിനാൽ സിന്ധു-സരസ്വതി നദീതട സംസ്കാരമെന്ന് വിളിക്കുന്നത് ശരിയായിരിക്കുമെന്ന് അഭിപ്രായപ്പെടുന്നു. എന്നാൽ ചിലർ 1920കളിൽ ആദ്യമായി ഉത്ഖനനം നടത്തിയ ഹാരപ്പ എന്ന സ്ഥലത്തെ പേര് പ്രകാരം ഈ സംസ്കാരത്തെ ഹാരപ്പൻ സംസ്കാരമെന്ന് വിളിക്കുന്നു.

Conclusion

This post presented a walkthrough for using Cohere’s multilingual embedding model along with Anthropic Claude 3 Sonnet on Amazon Bedrock. In particular, we showed how the same question, asked in multiple Indian languages, is answered using relevant documents retrieved from a vector store.

Cohere’s multilingual embedding model supports over 100 languages. It removes the complexity of building applications that require working with a corpus of documents in different languages. The Cohere Embed model is trained to deliver results in real-world applications. It handles noisy data as inputs, adapts to complex RAG systems, and delivers cost-efficiency from its compression-aware training method.

Start building with Cohere’s multilingual embedding model and Anthropic Claude 3 Sonnet on Amazon Bedrock today.

References

[1] Flores Dataset: https://github.com/facebookresearch/flores/tree/main/flores200


About the Author

Rony K Roy is a Sr. Specialist Solutions Architect, Specializing in AI/ML. Rony helps partners build AI/ML solutions on AWS.

Read More

Optimization Without Retraction on the Random Generalized Stiefel Manifold

Optimization over the set of matrices X that satisfy X^T B X = I_p, referred to as the generalized Stiefel manifold, appears in many applications involving sampled covariance matrices such as canonical correlation analysis (CCA), independent component analysis (ICA), and the generalized eigenvalue problem (GEVP). Solving these problems is typically done by iterative methods that require a fully formed B. We propose a cheap stochastic iterative method that solves the optimization problem while having access only to random estimates of B. Our method does not enforce the constraint in every… (Apple Machine Learning Research)

How Far Can Transformers Reason? The Locality Barrier and Inductive Scratchpad

Can Transformers predict new syllogisms by composing established ones? More generally, what type of targets can be learned by such models from scratch? Recent works show that Transformers can be Turing-complete in terms of expressivity, but this does not address the learnability objective. This paper puts forward the notion of distribution locality to capture when weak learning is efficiently achievable by regular Transformers, where the locality measures the least number of tokens required in addition to the tokens histogram to correlate nontrivially with the target. As shown experimentally… (Apple Machine Learning Research)

Applying RLAIF for Code Generation with API-usage in Lightweight LLMs

This paper was accepted at the Natural Language Reasoning and Structured Explanations workshop at ACL 2024.
Reinforcement Learning from AI Feedback (RLAIF) has demonstrated significant potential across various domains, including mitigating harm in LLM outputs, enhancing text summarization, and mathematical reasoning. This paper introduces an RLAIF framework for improving the code generation abilities of lightweight LLMs… (Apple Machine Learning Research)

Revisiting Non-separable Binary Classification and its Applications in Anomaly Detection

The inability to linearly classify XOR has motivated much of deep learning. We revisit this age-old problem and show that linear classification of XOR is indeed possible. Instead of separating data between halfspaces, we propose a slightly different paradigm, equality separation, that adapts the SVM objective to distinguish data within or outside the margin. Our classifier can then be integrated into neural network pipelines with a smooth approximation. From its properties, we intuit that equality separation is suitable for anomaly detection. To formalize this notion, we introduce closing…Apple Machine Learning Research