Know the rules The Paceline Forum Builder's Spotlight


  #46  
Yesterday, 03:59 AM
marciero
Senior Member
 
Join Date: Jun 2014
Location: Portland Maine
Posts: 3,322
Quote:
Originally Posted by verticaldoug View Post
Using Meta's claim for their latest model, Llama 3.1 405B, they say they trained it on a cluster of 16,000 H100s, which took 30.94 million GPU hours.

I think this equals about 20 GWh over 80 days.

I think this is the annual consumption of 5,300 homes.
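For anyone who wants to check the arithmetic in the quote, here is a rough sketch. The GPU-hour, cluster-size, and home-count figures are taken as quoted; the 700 W per GPU is an assumption (the rated board power of an H100 SXM card), and it ignores cooling, CPUs, and networking, so treat the result as a floor rather than a measurement.

```python
# Back-of-envelope check of the quoted training-energy figures.
GPU_HOURS = 30.94e6      # total GPU-hours, as quoted in the post
GPU_COUNT = 16_000       # H100s in the cluster, as quoted
HOMES = 5_300            # number of homes, as quoted
WATTS_PER_GPU = 700      # ASSUMPTION: H100 SXM board power; no datacenter overhead

# Wall-clock duration if all GPUs ran for the whole job.
days = GPU_HOURS / GPU_COUNT / 24

# Energy is power x time: watt-hours, converted to gigawatt-hours.
energy_gwh = GPU_HOURS * WATTS_PER_GPU / 1e9

# Annual usage per home implied by spreading that energy over the quoted homes.
kwh_per_home = energy_gwh * 1e6 / HOMES

print(f"{days:.1f} days, {energy_gwh:.1f} GWh, {kwh_per_home:.0f} kWh per home")
```

Under that assumption it comes out to roughly 80 days and a bit under 22 GWh, which works out to about 4,100 kWh per home. Note the result is an amount of energy (GWh), not a power level (GW).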
That's the training, where all the model weights (these are the "billions of parameters") are updated over many passes through the data. Once trained, these models can be deployed in production. Hugging Face, for example, has hundreds of pretrained open-source models. What is often done is to take a pretrained model, freeze most of the weights, and train only, say, the top layers on data specific to your use case. That is much quicker. Or you simply predict using the entire model, with no training at all. The point being that the training is more of a one-off deal. Not that prediction is not computationally intense; you still need a GPU to make it practical.
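To make the "freeze most of the weights, train only the top layers" idea concrete, here is a toy sketch in plain Python. Everything in it is made up for illustration: `FROZEN_W` stands in for a pretrained layer, the data is synthetic, and the two-weight "head" is the only thing that gets trained. A real setup would load a checkpoint and use a framework (in PyTorch, for instance, you would set `requires_grad = False` on the frozen parameters), but the principle is the same.

```python
import random

random.seed(0)

# Stand-in for a pretrained layer. ASSUMPTION: made-up values; a real model
# would load these from a checkpoint. "Frozen" means they are never updated.
FROZEN_W = ((0.5, -0.2), (0.1, 0.8))

def features(x):
    """Frozen layer: linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]

def predict(head_w, x):
    """Top layer: a linear read-out of the frozen features."""
    return sum(w * f for w, f in zip(head_w, features(x)))

def mean_sq_error(head_w, data):
    return sum((predict(head_w, x) - y) ** 2 for x, y in data) / len(data)

# Synthetic "use-case" data that the frozen features can express exactly.
inputs = [(random.random(), random.random()) for _ in range(50)]
data = [(x, 2.0 * features(x)[0] + 3.0 * features(x)[1]) for x in inputs]

head_w = [0.0, 0.0]      # the only trainable weights
loss_before = mean_sq_error(head_w, data)

LEARNING_RATE = 0.5
for _ in range(200):     # passes over the small dataset
    for x, y in data:
        f = features(x)
        err = predict(head_w, x) - y
        # Gradient step on the head only; FROZEN_W is left untouched.
        head_w = [w - LEARNING_RATE * err * fi for w, fi in zip(head_w, f)]

loss_after = mean_sq_error(head_w, data)
print(f"loss before: {loss_before:.4f}, after: {loss_after:.6f}")
```

Only two numbers are being updated here instead of the whole model, which is why this kind of fine-tuning is so much cheaper than training from scratch.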

Last edited by marciero; Yesterday at 04:07 AM.
  #47  
Today, 01:29 PM
dddd
Senior Member
 
Join Date: May 2016
Posts: 2,284
I recently went through a nearly year-long process of getting assets (liquidated per court order) through a probate distribution, and the bank's personnel made several references/excuses about how they had to work around their computer's algorithms (which seemed to be biased toward retention of liquid assets).

I imagine that, with AI programming accelerating the evolution of the algorithms that satisfy their Wall Street shareholders, an executor would likely die of old age before seeing any of the assets distributed.

I mentioned this to them several times to humor myself.
  #48  
Today, 03:08 PM
verticaldoug
Senior Member
 
Join Date: Nov 2009
Posts: 3,448
Quote:
Originally Posted by dddd View Post
I recently went through a nearly year-long process of getting assets (liquidated per court order) through a probate distribution, and the bank's personnel made several references/excuses about how they had to work around their computer's algorithms (which seemed to be biased toward retention of liquid assets).

I imagine that, with AI programming accelerating the evolution of the algorithms that satisfy their Wall Street shareholders, an executor would likely die of old age before seeing any of the assets distributed.

I mentioned this to them several times to humor myself.
I think if the algorithm did what you think it does, it would be in violation of the fiduciary duty that the executor has to the estate and the beneficiaries. There is a duty of care and impartiality; putting the bank's shareholders' interests in there would be a violation. I'd also say the bank's personnel need to be reminded of this, as their comments strike me as unprofessional and a hand wave. I'd ask for a deeper explanation of exactly how and why they are working around the algorithm. That comment is a regulatory problem for them from a fiduciary point of view. By working around the algorithm, they are failing to follow instructions; therefore either the algorithm is giving faulty instructions or the banking staff is exercising poor judgement.

Last edited by verticaldoug; Today at 03:18 PM.
All times are GMT -5. The time now is 04:20 PM.

