Instead of using Character.AI, which will send all my private conversations to governments, I found this solution. Any thoughts on this? 😅

  • @fishynoob@infosec.pub
    link
    fedilink
    English
    0
    10 days ago

    Thank you. I was going to try to host Ollama and Open WebUI. I think the problem is finding a source of pretrained/fine-tuned models that provide such… interaction. Does Hugging Face have such pre-trained models? Any suggestions?

    • @Naz@sh.itjust.works
      link
      fedilink
      English
      0
      edit-2
      10 days ago

      I don’t know what GPU you’ve got, but Lexi V2 is the best “small model” with emotions that I can cite off the top of my head.

      It tends to skew male and can be a little dark at times, but it’s more complex than expected for the size (8B feels like 48-70B).

      Lexi V2 Original

      Lexi V2 GGUF Version

      Do Q8_0 if you’ve got the VRAM, Q5_KL for speed, IQ4_XS if you’ve got a potato.
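      As a rough sizing sketch for those quants (back-of-envelope only: the bits-per-weight figures are my approximations, real GGUF files mix tensor precisions, and this ignores KV-cache/context overhead):

      ```python
      # Rough weight-memory estimate for an 8B-parameter model
      # at the quant levels mentioned above. Treat the bits/weight
      # values as approximate lower bounds, not exact file sizes.
      PARAMS = 8e9  # 8 billion parameters

      quants = {
          "Q8_0": 8.5,    # ~8.5 bits/weight including scale factors
          "Q5_KL": 5.7,   # mid-size 5-bit K-quant
          "IQ4_XS": 4.25, # small 4-bit importance-matrix quant
      }

      for name, bits in quants.items():
          gib = PARAMS * bits / 8 / 2**30  # bytes -> GiB
          print(f"{name}: ~{gib:.1f} GiB of VRAM/RAM for weights alone")
      ```

      So on an 8B model the gap between “got the VRAM” and “potato” is only a few GiB of weights, which is why the quant choice matters most on smaller cards.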

      • @fishynoob@infosec.pub
        link
        fedilink
        English
        0
        edit-2
        10 days ago

        I was going to buy the ARC B580s when they came back down in price, but with the tariffs I don’t think I’ll ever see them at MSRP. Even the used market is very expensive. I’ll probably hold off on buying GPUs for a few more months, till I can afford the higher prices or something changes. Thanks for the Lexi V2 suggestion.

        • @Naz@sh.itjust.works
          link
          fedilink
          English
          0
          edit-2
          10 days ago

          If you are using CPU only, you need to look at very small models or the 2-bit quants.

          Everything will be extremely slow otherwise:

          GPU:

          Loaded power: 465 W

          Speed: 18.5 tokens/second

          CPU:

          Loaded power: 115 W

          Speed: 1.60 tokens/second

          Normalized for power draw, the GPU delivers roughly 3 times the tokens per watt.
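          The per-watt claim can be checked with quick arithmetic on the figures quoted above:

          ```python
          # Tokens-per-watt comparison using the measured numbers above.
          gpu_tps, gpu_watts = 18.5, 465
          cpu_tps, cpu_watts = 1.60, 115

          gpu_eff = gpu_tps / gpu_watts  # tokens/sec per watt
          cpu_eff = cpu_tps / cpu_watts

          print(f"GPU: {gpu_eff:.4f} tok/s per W")
          print(f"CPU: {cpu_eff:.4f} tok/s per W")
          print(f"GPU is ~{gpu_eff / cpu_eff:.1f}x more efficient per watt")
          ```

          That works out to about 2.9× more tokens per watt for the GPU, on top of being ~11× faster in absolute throughput.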