`model_file = os.path.join(shared_model_path, "tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf")` uses `shared_model_path`, which points to the read-write dir (not the read dir). This could confuse students with similar setups down the line if taken out of context.
The text says "This notebook assumes that at least one 'Small model' file ending in .gguf has already been downloaded into a directory (see GPT4All_Download_gguf.ipynb for more)." Should this be "see `1-2-HuggingFace_Hub_Download_gguf.ipynb` for more"?
`chat_format="chatml" # Qwen uses ChatML format` suggests that the chat format is determined by the model, but otherwise this wasn't apparent to me. It does end up being addressed later, so that discussion could be moved up (probably not important) and/or a note could be added here about how to determine the proper format.
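For readers wondering what "ChatML format" actually means, a minimal sketch of the prompt layout may help (illustrative only; when `chat_format="chatml"` is set, llama-cpp-python builds this string for you from the messages list):

```python
def to_chatml(messages):
    # ChatML wraps each turn in <|im_start|>role ... <|im_end|> markers;
    # Qwen (and several other models) were trained on this layout.
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # The trailing assistant header tells the model it is its turn to speak.
    parts.append("<|im_start|>assistant")
    return "\n".join(parts)

print(to_chatml([{"role": "user", "content": "Hi"}]))
```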
Suggest changing "how a language model actually works with numbers" to "how a language model actually works with numbers (embeddings)".
Cells with `## 1. Environment Setup`, `## 5. Inside a GGUF File`, `## 7. How Concepts Map to Numbers: Token Embeddings`, and `## 8. Putting It All Together: The Full Pipeline` have `---` at the beginning, so they don't render correctly.
The first time you see a tokenizer break a word into smaller pieces is mid-way through the notebook, with "comparative." Is there a quick way to introduce this sooner? Even just showing that word tokenized by itself early on would help.
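For an early illustration, even a toy longest-match tokenizer makes the idea concrete (this is not real BPE — real tokenizers learn merge rules from data — but the output shape is similar; the vocabulary here is invented for the example):

```python
def greedy_subword_tokenize(word, vocab):
    # Toy subword tokenizer: repeatedly take the longest vocabulary
    # match, falling back to a single character when nothing matches.
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens

vocab = {"compar", "ative", "comp", "ar", "a", "tive"}
print(greedy_subword_tokenize("comparative", vocab))  # ['compar', 'ative']
```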
I don't think I'm getting correct numbers printed in section 5.1 Reading GGUF Metadata when I run the notebook... the context length can't be 3, right?!
Similarly, the output just below that doesn't feel right either (and if it is in fact correct, it could use additional explanatory text).
Is it worth mentioning "size (bytes on disk — much smaller than the raw float32 equivalent)" in 5.3 if this isn't actually printed out?
The cell in 7.1 doesn't run for me. Also, you may want to silence the deprecation warning.
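If you do want to suppress the warning, the standard-library `warnings` context manager is one option (the wrapper function here is my own suggestion, not the notebook's code; you could also just wrap the one offending call):

```python
import warnings

def call_quietly(fn, *args, **kwargs):
    # Run fn with DeprecationWarnings suppressed; other warning
    # categories still get through.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return fn(*args, **kwargs)
```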
The next steps at the end of the notebook don't make sense: `Inside_Small_Model.ipynb` doesn't exist in the cloned repo, and `LlamaCpp_SmallLM_Demo.ipynb` is numbered 2.1 (i.e., before this notebook, 2.2).
The Experiment 1 haiku cell doesn't run for me (sonnet is OK): `NotFoundError: Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'model: claude-haiku-4-20250514'}, 'request_id': 'req_011CYsz6HZhhv1XJuj5STKya'}`. Changing to `haiku_model_id = "claude-3-haiku-20240307"` worked for me to run the rest of the notebook.
The discussion of temperature slightly conflicts with notebook 2.1, at least when it comes to setting the temperature to 1.
The cell that goes with Putting It All Together doesn't run for me: `BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'stop_sequences: each stop sequence must contain non-whitespace'}, 'request_id': 'req_011CYszyuv9rRMfYwWhotTnS'}`. Removing the `stop_sequences` parameter makes the cell run.
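If the stop sequences are meant to stay, one defensive fix is to filter out empty or whitespace-only entries before the request, since the API rejects those (the function name is my suggestion, not from the notebook):

```python
def clean_stop_sequences(stops):
    # The API requires every stop sequence to contain non-whitespace;
    # drop entries like "\n\n" or "  " that would trigger a 400.
    return [s for s in stops if s.strip()]

print(clean_stop_sequences(["\n\n", "END", "  "]))  # ['END']
```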
The cell that begins with `# Send a chat message to GPT-4o-mini` should be removed or moved down, because it doesn't belong in the Checking Available Models section.
The two cells after the interactive widget are incredibly similar. Do you need both? Can you provide more context for both/either?
Reflection checkpoints are mentioned at the beginning, and the framing is odd out of context / on first read. It also appears that only Checkpoint 3 actually writes to an external file...
The tables of SAT questions include the columns `visuals.type` and `visuals.svg_content`, which are all NaNs. I'd exclude these from the output.
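Dropping all-NaN columns before display is a one-liner with pandas (the toy frame below just mimics the columns mentioned; the real DataFrame would come from the notebook's data load):

```python
import pandas as pd

df = pd.DataFrame({
    "question": ["Q1", "Q2"],
    "visuals.type": [float("nan")] * 2,
    "visuals.svg_content": [float("nan")] * 2,
})
# Drop any column whose values are entirely NaN.
cleaned = df.dropna(axis=1, how="all")
print(list(cleaned.columns))  # ['question']
```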
The dataframes created with `pd.concat()` seem to contain each question twice. What's going on there? Either more explanation is needed or the duplicates should be removed. I don't think the dataframes are even used...
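If the duplication is unintentional, de-duplicating right after the concat is straightforward (toy frame below; the real `subset` column would be whatever uniquely identifies a question):

```python
import pandas as pd

qs = pd.DataFrame({"question": ["Q1", "Q2"]})
combined = pd.concat([qs, qs], ignore_index=True)   # each row now appears twice
deduped = combined.drop_duplicates(subset="question", ignore_index=True)
print(len(combined), len(deduped))  # 4 2
```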
The Ask the model - English code cell doesn't define `random_para_text`. It should include `random_para_text = random_entry["question"]["paragraph"]`.
Cells with `## Our Knowledge Base`, `## Finding the Right Document`, `### The Real Test: Different Words, Same Meaning`, `### Visualizing Meaning Space`, `## The Full Pipeline: Retrieve + Generate`, `## Try It Yourself!`, and `## Key Takeaways` have `---` at the beginning, so they don't render correctly.
`ls` lists all the files in a given dir in Step 1. You're just trying to show that the dir exists, right?

The keys in `.env` are capitalized, but later you use `openai_API_KEY`. It still loads, but it's an odd inconsistency.