
Another way to make the LLM output more readable is to start every new sentence on a new line.

Sample Python code:

###############

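# Assumes `import re` at the top and `lastdecoded = ""` set before the token loop.
currentdecoded = tokenizer_stream.decode(new_token)
# If the previous token was exactly ".", ":" or ";" and this one starts with a space,
# replace that space with a newline (tokens starting with " *" are left unchanged).
if re.search(r"^[.:;]$", lastdecoded) and currentdecoded.startswith(" ") and not currentdecoded.startswith(" *"):
    currentdecoded = "\n" + currentdecoded.replace(" ", "", 1)
print(currentdecoded, end='', flush=True)
lastdecoded = currentdecoded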

###############

The full code is at:

https://huggingface.co/zamroni111/Meta-Llama-3.1-8B-Instruct-ONNX-DirectML-GenAI-INT4/blob/main/onnxgenairun.py
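To try the rule without loading a model, here is a minimal standalone sketch of the same idea. The function name `break_sentences` and the sample token pieces are hypothetical and are not taken from the linked script; a real tokenizer may split the text differently, and the rule only fires when the punctuation mark arrives as its own token.

```python
import re

def break_sentences(pieces):
    """Yield decoded pieces, starting a new line after a lone '.', ':' or ';'."""
    last = ""
    for piece in pieces:
        # Same condition as the snippet above: previous piece is a single
        # punctuation mark, and the current piece starts with a space but not " *".
        if (re.fullmatch(r"[.:;]", last)
                and piece.startswith(" ")
                and not piece.startswith(" *")):
            piece = "\n" + piece[1:]  # swap the leading space for a newline
        yield piece
        last = piece

# Hypothetical decoded pieces standing in for the streamed tokens.
pieces = ["Hello", " world", ".", " Next", " one", ":", " done", "."]
text = "".join(break_sentences(pieces))
print(text)
```

Because the check looks at the previous piece only, something like "3.14" stays on one line as long as the piece after "." does not start with a space.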
