• 0 Posts
  • 4 Comments
Joined 1 year ago
Cake day: June 29th, 2023

  • If I am not mistaken, the difference was that the Internet Archive was distributing books with DRM that would make the PDF unusable after a certain time. You could compare it to how a physical library lends books for a limited time, for free. Of course, one could bypass the DRM or copy the contents some other way, but a person can just as well photocopy a book they borrowed physically. Meanwhile, other physical libraries are allowed to distribute e-books, but I’m not sure if that’s made possible by licensing fees.

    I’m not saying that they approached this well, especially given the copyright laws in the US, but it was indeed a good thing for the normal person at the time. Too bad that the judicial system in the US is biased towards leeching companies. I really can’t wait to see the AI vs publishers fight, though. Let’s see who has deeper pockets and better plants in the courts :D


  • You’re right. I read past the “I want to learn ML” and went straight to “do something useful with the data”.

    If the goal is to understand how modern LLMs work, it’s also good to read up on RNNs and LSTMs. For this, 3Blue1Brown does an amazing job, and even posted an in-depth video about transformers. I’d watch that next, followed by implementing a simple transformer in PyTorch (perhaps using the existing blocks).

    You could argue that it’s important to design everything from scratch first, but it’s easier to first go high level, see how the network behaves, and then attempt to implement it yourself based on the paper. It is up to OP how comfortable he is with the topic though 😁
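
For the "implement it yourself" step, here is a rough sketch of scaled dot-product attention, the core operation inside a transformer, in plain Python. The function names and the toy vectors are made up for illustration, and it leaves out the learned query/key/value projections, multiple heads, and masking that a real model has, so treat it as a starting point rather than a reference implementation.

```python
import math


def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Returns the attended outputs and the attention weights per query.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy "embeddings" for three tokens (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: the same vectors act as queries, keys, and values.
outputs, weights = attention(tokens, tokens, tokens)
```

Printing `weights` shows how strongly each token attends to the others. Once this makes sense, comparing it against the ready-made attention block in PyTorch is a good sanity check.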


  • Depending on how much compute you have available, you can look into finetuning models from HuggingFace (e.g. Llama 3, or a smaller Phi model). Look into LoRA, and try to learn how the model you choose calculates the loss.

    There are various ways to train. Masked language modeling (used by BERT-style models) replaces random input tokens with a mask token and asks the model to recover them, while decoder-only models such as Llama and Phi are trained to predict the next token. I won’t go into too much detail here, because it’s a lot to explain, and I suggest you read an article on this (link1 or link2).
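
To give a feel for what the training targets and the loss look like, here is a small plain-Python sketch. It shows random token masking (the BERT-style objective), the shifted next-token targets that decoder-only models like Llama or Phi use, and the cross-entropy loss for a single prediction. All names, the `[MASK]` string, and the probabilities are made up for illustration, and the masking is simplified (real recipes sometimes swap in a random token or keep the original instead of always masking).

```python
import math
import random

MASK = "[MASK]"  # placeholder mask token, chosen for this sketch


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Randomly replace tokens with MASK.

    Returns (inputs, targets). targets holds the original token at
    masked positions and None elsewhere, since the loss is only
    computed where a token was hidden.
    """
    if rng is None:
        rng = random.Random(0)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)
        else:
            inputs.append(tok)
            targets.append(None)
    return inputs, targets


def next_token_pairs(tokens):
    """Causal LM setup: each position is trained to predict the next token."""
    return tokens[:-1], tokens[1:]


def cross_entropy(probs, target):
    """Loss for one position: negative log of the probability the
    model assigned to the correct token."""
    return -math.log(probs[target])


sentence = "the cat sat on the mat".split()

masked_inputs, masked_targets = mask_tokens(
    sentence, mask_prob=0.5, rng=random.Random(0)
)
context, next_tokens = next_token_pairs(sentence)

# Pretend the model produced this distribution for one position.
predicted = {"sat": 0.7, "ran": 0.2, "mat": 0.1}
loss = cross_entropy(predicted, "sat")
```

In practice the loss is averaged over all predicted positions in a batch, and the libraries handle the masking or shifting for you, but the idea is the same.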