Georg Kalus

OVHcloud AI endpoints are now in General Availability

OVHcloud AI endpoints

OVHcloud last week officially announced the General Availability of its AI Endpoints, which make more than 40 models for text, speech, image, and code tasks easily accessible behind ready-to-use API endpoints.

The AI Endpoints are a convenient, easy-to-use way to leverage powerful AI models for GenAI, text-to-speech, image recognition, image generation, working with code, and more, under a no-commitment, pay-per-use pricing model.
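
To give a feel for "ready-to-use", here is a minimal sketch of calling a chat model through an OpenAI-compatible completions route. The base URL and token are placeholders (check the OVHcloud AI Endpoints catalog for the real values of the model you pick), and the OpenAI-compatible request shape is an assumption based on how such endpoints are commonly exposed:

```python
import json
import os
import urllib.request

# Placeholders -- replace with the endpoint URL and token from your
# OVHcloud AI Endpoints dashboard.
BASE_URL = os.environ.get("OVH_AI_ENDPOINTS_URL", "https://<your-endpoint>/v1")
API_KEY = os.environ.get("OVH_AI_ENDPOINTS_TOKEN", "<your-token>")

def build_chat_payload(prompt: str, model: str = "Llama-3.1-8B-Instruct") -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

def ask(prompt: str) -> str:
    """POST the request and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example usage (requires a valid endpoint URL and token):
#   print(ask("Summarize what an API endpoint is in one sentence."))
```

Because the request body follows the common OpenAI wire format, existing client libraries and tooling built around that format should work by pointing them at the endpoint URL.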

Available Models at OVHcloud

OVHcloud is launching the AI Endpoints with over 40 models. The available models are listed below.

LLM (Large Language Models)

High-performance models designed to understand and generate text in a human-like way, used for chatbots, text summarization, or content creation.

| Model | Description |
| --- | --- |
| Llama-3.1-8B-Instruct | Llama 3.1 (8B parameter version) is an auto-regressive language model that uses an optimized transformer architecture. It was released by Meta AI on July 23, 2024, and uses supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human preferences for helpfulness and safety. The knowledge cutoff for this model is December 31, 2023. |
| Llama-3.1-70B-Instruct | Llama 3.1 (70B parameter version) is an auto-regressive language model that uses an optimized transformer architecture. It was released by Meta AI on July 23, 2024, and uses supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human preferences for helpfulness and safety. The knowledge cutoff for this model is December 31, 2023. |
| Llama-3.3-70B-Instruct | Llama 3.3 is an instruction-tuned generative language model optimized for multilingual dialogue use cases. It was released by Meta AI on December 6, 2024, uses an advanced transformer architecture, and is aligned with human preferences for helpfulness and safety through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). The knowledge cutoff for this 70B model is December 31, 2023. |
| Mistral-Nemo-Instruct-2407 | The Mistral-Nemo-Instruct-2407 model, developed collaboratively by Mistral AI and NVIDIA, is an instruction-tuned LLM released in 2024. Designed for multilingual applications, it excels in tasks such as conversational dialogue, code generation, and instructional comprehension across various languages. |

Reasoning LLM

| Model | Description |
| --- | --- |
| DeepSeek-R1-Distill-Llama-70B | DeepSeek-R1-Distill-Llama-70B was trained via large-scale reinforcement learning. It was released by DeepSeek on January 20, 2025, and is a distilled version of the Llama 3.3 70B model. The knowledge cutoff date for this model is July 1, 2024. |

Visual LLM

| Model | Description |
| --- | --- |
| Qwen2.5-VL-72B-Instruct | Qwen2.5-VL is a powerful vision-language model designed for advanced image understanding. It can generate detailed image captions, analyze documents, perform OCR, detect objects, and answer questions based on visuals, making it useful for AI assistants, RAG, and agents. |
| llava-next-mistral-7b | LLaVA combines a pre-trained large language model with a pre-trained vision encoder for multimodal (image + text) chatbot use cases. LLaVA-NeXT (also known as LLaVA-1.6) improves upon the 1.5 series by incorporating higher image resolutions and more reasoning/OCR datasets. |
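
Visual LLMs take a text question and an image in the same request. Assuming the endpoints follow the common OpenAI-style multimodal message format (an assumption, not confirmed by the announcement), the image is typically passed inline as a base64 data URL:

```python
import base64

def build_vision_message(question: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build an OpenAI-style multimodal user message pairing a text
    question with the image encoded as an inline base64 data URL."""
    data_url = f"data:{mime};base64,{base64.b64encode(image_bytes).decode()}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }

# The message then slots into an ordinary chat completion payload.
# The model name below is taken from the catalog above; the bytes are
# a stand-in for real image data.
payload = {
    "model": "Qwen2.5-VL-72B-Instruct",
    "messages": [build_vision_message("What is in this image?", b"\x89PNG...")],
}
```

Inlining the image keeps the request self-contained, at the cost of a roughly 33% size overhead from base64 encoding.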

Code LLM

| Model | Description |
| --- | --- |
| Qwen2.5-Coder-32B-Instruct | Qwen2.5-Coder-32B-Instruct is the 32B instruction-tuned model of the Qwen2.5-Coder series, a family of code-specialized large language models from the Qwen team. It is designed for code generation, code reasoning, and code repair across a wide range of programming languages, while retaining strong general language capabilities. |
| mamba-codestral-7B-v0.1 | Mamba-Codestral-7B-v0.1 (Codestral Mamba) is a code generation model released by Mistral AI in July 2024. Unlike transformer-based models, it is built on the Mamba2 state-space architecture, which offers linear-time inference and efficient handling of long code contexts. |

Embeddings

| Model | Description |
| --- | --- |
| bge-multilingual-gemma2 | BGE-Multilingual-Gemma2 is an LLM-based multilingual embedding model trained on a diverse range of languages and tasks. Its training data spans a broad range of languages, including English, Chinese, Japanese, Korean, French, and more, and covers a variety of task types, such as retrieval, classification, and clustering. It exhibits state-of-the-art (SOTA) results on multilingual benchmarks like MIRACL, MTEB-pl, and MTEB-fr, and achieves excellent performance on other major evaluations, including MTEB, C-MTEB, and AIR-Bench. |
| BGE-M3 | The BGE-M3 model, developed by BAAI and released in early 2024, is a powerful multilingual, multifunctional, and multigranular embedding model designed for retrieval tasks. It supports dense, multi-vector, and sparse retrieval while handling texts of varying lengths across over 100 languages. |
| bge-base-en-v1.5 | This model converts English text into dense vector embeddings, facilitating tasks like semantic similarity search and information retrieval. It was released in September 2023 and was developed by BAAI (the Beijing Academy of Artificial Intelligence). |
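
The retrieval use case these models serve boils down to comparing vectors: each text is mapped to a vector, and semantically similar texts end up with a small angle between their vectors. A minimal sketch of ranking documents by cosine similarity, using tiny made-up vectors in place of real model output (a model like bge-multilingual-gemma2 returns vectors with thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embeddings.
query = [0.9, 0.1, 0.0]
docs = {
    "doc_a": [0.8, 0.2, 0.1],  # points in nearly the same direction as the query
    "doc_b": [0.0, 0.1, 0.9],  # nearly orthogonal to the query
}

# Rank documents by similarity to the query, best match first.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked)
```

In a real pipeline the vectors would come from the embeddings endpoint, and for large corpora the brute-force sort would be replaced by an approximate nearest-neighbor index.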

Computer Vision

| Model | Description |
| --- | --- |
| yolov11x-image-segmentation | This YOLO model, developed by Ultralytics, excels in real-time instance segmentation, enabling precise identification and delineation of objects within images. |
| yolov11x-object-detection | This YOLO model, developed by Ultralytics, is a state-of-the-art object detection model released in 2024, offering improved performance in real-time object detection across various image types. |

Image Generation

| Model | Description |
| --- | --- |
| stable-diffusion-xl-base-v10 | SDXL, developed by Stability AI, is an advanced text-to-image model released in July 2023, offering enhanced image generation capabilities. |

Natural Language Processing

| Model | Description |
| --- | --- |
| roberta-base-go_emotions | This model, developed by Sam Lowe, is a fine-tuned version of RoBERTa for multi-label emotion classification, released in 2020. It is designed to identify 28 different emotions in text. |
| bart-large-cnn | The BART large CNN model, published by Meta AI, is a fine-tuned version of BART specifically trained for summarization tasks using the CNN/Daily Mail dataset. It is one of the best pre-trained models available for abstractive text summarization. |
| bert-base-NER | This BERT base NER model is a fine-tuned version of BERT, developed by dslim. It is specifically trained for Named Entity Recognition (NER) tasks, enabling it to identify and classify entities such as locations, organizations, persons, and miscellaneous entities within text. It was fine-tuned on the English version of the CoNLL-2003 Named Entity Recognition dataset. |
| bert-base-multilingual-uncased-sentiment | This BERT base model is a fine-tuned version of BERT, developed by NLP Town. It is specifically trained for sentiment analysis across six languages: English, Dutch, German, French, Spanish, and Italian. It predicts the sentiment of a review on a scale of 1 to 5 stars. |

Translation

| Model | Description |
| --- | --- |
| t5-large | The T5-large model, developed by Google Research, is designed to translate English text into other languages. |

Audio Analysis

| Model | Description |
| --- | --- |
| nvr-tts-it-it | This NVIDIA TTS model generates natural-sounding Italian speech from raw text without requiring additional information. |
| nvr-tts-de-de | This NVIDIA TTS model generates natural-sounding German speech from raw text without requiring additional information. |
| nvr-tts-en-us | This NVIDIA TTS model generates natural-sounding American English speech from raw text without requiring additional information. |
| nvr-asr-en-gb | This NVIDIA ASR model allows you to recognize and transcribe British English audio speech into text. It is trained on diverse datasets to ensure high accuracy across different accents and domains. |
| nvr-asr-fr-fr | This NVIDIA ASR model allows you to recognize and transcribe European French audio speech into text. It is trained on diverse datasets to ensure high accuracy across different accents and domains. |
| nvr-asr-es-es | This NVIDIA ASR model allows you to recognize and transcribe European Spanish audio speech into text. It is trained on diverse datasets to ensure high accuracy across different accents and domains. |
| nvr-tts-es-es | This NVIDIA TTS model generates natural-sounding European Spanish speech from raw text without requiring additional information. |
| nvr-asr-en-us | This NVIDIA ASR model allows you to recognize and transcribe American English audio speech into text. It is trained on diverse datasets to ensure high accuracy across different accents and domains. |

Pricing

OVHcloud AI Endpoints are charged per one million (1M) tokens. Below is a comparison with the comparable services of Azure AI, AWS, Scaleway, and IONOS for the Llama 3.3 70B Instruct model.

| Vendor | Model | Input Tokens €/1M | Output Tokens €/1M |
| --- | --- | --- | --- |
| IONOS AI Model Hub | Llama 3.3 70B Instruct | 1,50 € | 1,75 € |
| Scaleway Generative APIs | Llama 3.3 70B Instruct | 0,90 € | 0,90 € |
| OVHcloud AI Endpoints | Llama 3.3 70B Instruct | 0,79 € | 0,79 € |
| AWS Bedrock | Llama 3.3 70B Instruct | 0,72 € | 0,72 € |
| Azure AI | Llama 3.3 70B Instruct | 0,268 € | 0,354 € |
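
Per-1M-token pricing means the bill scales linearly with usage, so comparing vendors for a given workload is simple arithmetic. A small sketch using the rates from the table above (decimal points instead of the table's decimal commas):

```python
# Prices in EUR per 1M tokens, taken from the comparison table above.
PRICES = {
    "IONOS AI Model Hub":       {"input": 1.50,  "output": 1.75},
    "Scaleway Generative APIs": {"input": 0.90,  "output": 0.90},
    "OVHcloud AI Endpoints":    {"input": 0.79,  "output": 0.79},
    "AWS Bedrock":              {"input": 0.72,  "output": 0.72},
    "Azure AI":                 {"input": 0.268, "output": 0.354},
}

def cost_eur(vendor: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in EUR for the given token counts at a vendor's per-1M rates."""
    p = PRICES[vendor]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example workload: 5M input tokens + 1M output tokens in a month.
for vendor in PRICES:
    print(f"{vendor}: {cost_eur(vendor, 5_000_000, 1_000_000):.2f} €")
```

Note that input/output token ratios matter: Azure's asymmetric pricing favors input-heavy workloads, while the flat-rate vendors charge the same either way.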

Try the OVHcloud AI Endpoints

Check out the OVHcloud AI endpoints here: OVHcloud AI Endpoints Playground.
