My idea is to gather enough usage data, then load the most frequently used layers/experts onto the GPU. My workload is mostly programming, so I figure not all layers/experts will be activated equally often. If I can narrow down the ones most likely to be used for programming, I can load those onto the GPU first. Doing this might yield some gains. Does anyone know how to go about this?
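To make the idea concrete, here is a rough sketch of the selection step only. It assumes you already have some way to log which (layer, expert) pair each token was routed to; how to get that log depends on the inference engine and is not shown. The function names, the equal-size assumption for experts, and the VRAM budget are all hypothetical and only there for illustration.

```python
from collections import Counter


def rank_experts(activations):
    """Count how often each (layer, expert) pair appears in the log.

    `activations` is an assumed log format: an iterable of
    (layer_index, expert_index) tuples, one per routed token.
    """
    return Counter(activations)


def pick_gpu_experts(counts, expert_bytes, vram_budget_bytes):
    """Greedily pick the most frequently used experts that fit the budget.

    Assumes every expert has the same size (`expert_bytes`), which is a
    simplification; real expert tensors may differ in size.
    """
    chosen = []
    used = 0
    for key, _ in counts.most_common():
        if used + expert_bytes > vram_budget_bytes:
            break
        chosen.append(key)
        used += expert_bytes
    return chosen


def coverage(counts, chosen):
    """Fraction of logged activations served by the chosen experts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[key] for key in chosen) / total
```

With a log from a programming-heavy session, `coverage` gives a quick check on whether usage is skewed enough for this to be worth trying: if the experts that fit on the GPU cover only a small share of activations, pinning them is unlikely to help much.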