Releases: mistralai/mistral-common
v1.8.6: Remove Python 3.9 support, bug fixes
What's Changed
- Remove deprecated imports in docs. by @juliendenize in #138
- Add normalizer and validator utils by @juliendenize in #140
- Refactor private aggregate messages for InstructRequestNormalizer by @juliendenize in #141
- test: improve unit test for is_opencv_installed by @PrasanaaV in #143
- Optimize spm decode function by @juliendenize in #144
- Add get_one_valid_tokenizer_file by @juliendenize in #142
- Remove Python 3.9 support by @juliendenize in #145
- Correctly pass `revision` and `token` to hf_api by @juliendenize in #149
- Fix assertion in test_convert_text_chunk and tool_call by @patrickvonplaten in #152
- Pins GH actions by @arcanis in #160
- Add usage restrictions regarding third-party rights. by @juliendenize in #161
- Improve tekken logging message for vocabulary by @juliendenize in #162
- Set version 1.8.6 by @juliendenize in #151
New Contributors
- @PrasanaaV made their first contribution in #143
- @arcanis made their first contribution in #160
Full Changelog: v1.8.5...v1.8.6
v1.8.5: Patch Release
What's Changed
- Make model field optional in TranscriptionRequest by @juliendenize in #128
- Remove all responses and embedding requests. Add transcription docs. by @juliendenize in #133
- Add chunk file by @juliendenize in #129
- allow message content to be empty string by @mingfang in #135
- Add test empty content for AssistantMessage v7 by @juliendenize in #136
- v1.8.5 by @juliendenize in #137
New Contributors
- @mingfang made their first contribution in #135
Full Changelog: v1.8.4...v1.8.5
v1.8.4: optional dependencies and allow random padding on ChatCompletionResponseStreamResponse
What's Changed
- Update experimental.md by @juliendenize in #124
- Make sentencepiece optional and refactor optional imports by @juliendenize in #126
- Improve UX for contributing by @juliendenize in #127
- feat: allow random padding on ChatCompletionResponseStreamResponse by @aac228 in #131
New Contributors
- @aac228 made their first contribution in #131
Full Changelog: v1.8.3...v1.8.4
v1.8.3: Add an experimental REST API
What's Changed
- Add a FastAPI app by @juliendenize in #113
We released an experimental REST API leveraging FastAPI to handle requests end to end: tokenization, generation via calls to an engine, and detokenization.
For detailed documentation, see https://mistralai.github.io/mistral-common/usage/experimental/.
Here is how to launch the server:

```sh
pip install mistral-common[server]

mistral_common serve mistralai/Magistral-Small-2507 \
  --host 127.0.0.1 --port 8000 \
  --engine-url http://127.0.0.1:8080 --engine-backend llama_cpp \
  --timeout 60
```

Then you can see the Swagger at: http://localhost:8000.
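As a client-side sketch, you could then build a request against the running server with only the standard library. Note that the endpoint path and payload shape below are assumptions for illustration; check the documentation linked above for the routes the server actually exposes:

```python
import json
from urllib import request as urllib_request

BASE_URL = "http://localhost:8000"

# A chat-style payload; the exact schema is an assumption here.
payload = {
    "messages": [
        {"role": "user", "content": "Hello, how are you?"},
    ],
}
body = json.dumps(payload).encode("utf-8")

req = urllib_request.Request(
    f"{BASE_URL}/v1/chat/completions",  # hypothetical endpoint path
    data=body,
    headers={"Content-Type": "application/json"},
)

# Sending the request requires the server launched above to be running:
# with urllib_request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
print(req.full_url)
```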
Full Changelog: v1.8.2...v1.8.3
v1.8.2: Add ThinkChunk
What's Changed
- Add think chunk by @juliendenize in #122
Now you can use TextChunk and ThinkChunk in your SystemMessage or AssistantMessage:
```python
from mistral_common.protocol.instruct.messages import SystemMessage, TextChunk, ThinkChunk

system_message = SystemMessage(
    content=[
        TextChunk(text="First draft your thinking process (inner monologue) until you arrive at a response. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input.\n\nYour thinking process must follow the template below:"),
        ThinkChunk(
            thinking="Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response. Use the same language as the input.",
            closed=True,
        ),
        TextChunk(text="Here, provide a self-contained response."),
    ],
)
```

Full Changelog: v1.8.1...v1.8.2
v1.8.1: Add AudioURLChunk
What's Changed
- Add AudioURLChunk by @juliendenize in #120
Now you can use http(s) URLs, file paths, and base64 strings (without specifying the format) in your content chunks, thanks to AudioURLChunk!
```python
from mistral_common.protocol.instruct.messages import AudioURL, AudioURLChunk, TextChunk, UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

repo_id = "mistralai/Voxtral-Mini-3B-2507"
tokenizer = MistralTokenizer.from_hf_hub(repo_id)

text_chunk = TextChunk(text="What do you think about this audio?")

user_msg = UserMessage(
    content=[
        AudioURLChunk(audio_url=AudioURL(url="https://freewavesamples.com/files/Ouch-6.wav")),
        text_chunk,
    ]
)

request = ChatCompletionRequest(messages=[user_msg])
tokenized = tokenizer.encode_chat_completion(request)

# pass tokenized.tokens to your favorite audio model
print(tokenized.tokens)
print(tokenized.audios)

# print text to visually see tokens
print(tokenized.text)
```

Full Changelog: v1.8.0...v1.8.1
v1.8.0 - Mistral welcomes 📢
What's Changed
- [Audio] Add audio by @patrickvonplaten in #119
Full Changelog: v1.7.0...v1.8.0
Audio chat example:

```python
from mistral_common.protocol.instruct.messages import TextChunk, AudioChunk, UserMessage, AssistantMessage, RawAudio
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.audio import Audio
from huggingface_hub import hf_hub_download

repo_id = "mistralai/voxtral-mini"
tokenizer = MistralTokenizer.from_hf_hub(repo_id)

obama_file = hf_hub_download("patrickvonplaten/audio_samples", "obama.mp3", repo_type="dataset")
bcn_file = hf_hub_download("patrickvonplaten/audio_samples", "bcn_weather.mp3", repo_type="dataset")

def file_to_chunk(file: str) -> AudioChunk:
    audio = Audio.from_file(file, strict=False)
    return AudioChunk.from_audio(audio)

text_chunk = TextChunk(text="Which speaker do you prefer between the two? Why? How are they different from each other?")
user_msg = UserMessage(content=[file_to_chunk(obama_file), file_to_chunk(bcn_file), text_chunk]).to_openai()

request = ChatCompletionRequest(messages=[user_msg])
tokenized = tokenizer.encode_chat_completion(request)

# pass tokenized.tokens to your favorite audio model
print(tokenized.tokens)
print(tokenized.audios)

# print text to visually see tokens
print(tokenized.text)
```

Audio transcription example:
```python
from mistral_common.protocol.transcription.request import TranscriptionRequest
from mistral_common.protocol.instruct.messages import RawAudio
from mistral_common.audio import Audio
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from huggingface_hub import hf_hub_download

repo_id = "mistralai/voxtral-mini"
tokenizer = MistralTokenizer.from_hf_hub(repo_id)

obama_file = hf_hub_download("patrickvonplaten/audio_samples", "obama.mp3", repo_type="dataset")

audio = Audio.from_file(obama_file, strict=False)
audio = RawAudio.from_audio(audio)
request = TranscriptionRequest(model=repo_id, audio=audio, language="en")

tokenized = tokenizer.encode_transcription(request)

# pass tokenized.tokens to your favorite audio model
print(tokenized.tokens)
print(tokenized.audios)

# print text to visually see tokens
print(tokenized.text)
```

v1.7.0 - v13 instruct tokenizer, rename multi-modal to image
What's Changed
- [Naming] Rename multi-modal to image by @patrickvonplaten in #114
- Add v13 Tokenizer by @juliendenize in #116
- 1.7.0 Release by @patrickvonplaten in #118
Full Changelog: v1.6.3...v1.7.0
v1.6.3 - Improved from_hf_hub, support multiprocessing, ...
What's Changed
- Improve hf hub support by @juliendenize in #95
- Fix the Python badge by @juliendenize in #96
- [Build system] Ensure UV reads more than just py files by @patrickvonplaten in #97
- Update images.md by @juliendenize in #98
- Improve decode and deprecate to_string by @juliendenize in #99
- Fix string formatting for ConnectionError by @gaby in #101
- Fix string formatting for NotImplementedError() by @gaby in #103
- Fix error message instructions in transform_image() by @gaby in #102
- Fix spelling issues across repo by @gaby in #107
- Improve integration with HF by @juliendenize in #104
- Opening tekkenizer file with utf-8 and remove deprecation warning by @juliendenize in #110
- fix: multiprocessing pickle error with tokenizer by @NanoCode012 in #111
New Contributors
- @gaby made their first contribution in #101
- @NanoCode012 made their first contribution in #111
Full Changelog: v1.6.0...v1.6.3
v1.6.2: Patch Release
Ensure that the PyPI version includes the tokenizer files.