* '''4. Decoding:''' Begin with the start-of-sequence token. The model produces scores for the next token in each candidate sequence and passes them through an output layer with a softmax function that normalizes them into probabilities. Beam Search then chooses which paths to continue following and prunes the rest.
* '''5. Finalize Output:''' Select the beam with the highest probability as the final translation, then convert the tokenized output back into a readable sentence with punctuation, capitalization, etc.
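A minimal sketch of steps 4 and 5, assuming a hypothetical model interface: <code>next_token_log_probs(seq)</code> is a placeholder that returns a mapping from each candidate next token to its log probability, and <code>detokenize()</code> stands in for whatever post-processing restores punctuation and capitalization.

<syntaxhighlight lang="python">
def beam_search_decode(model, detokenize, beam_width=3, max_len=20,
                       sos="<sos>", eos="<eos>"):
    # Each beam is a (token list, cumulative log-probability) pair,
    # starting from the start-of-sequence token (step 4).
    beams = [([sos], 0.0)]

    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos:            # finished beams carry over unchanged
                candidates.append((seq, score))
                continue
            # Score every possible next token for this beam (hypothetical API).
            for tok, logp in model.next_token_log_probs(seq).items():
                candidates.append((seq + [tok], score + logp))
        # Keep only the top-k paths; everything else is pruned.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]

    # Step 5: take the highest-scoring beam and turn it back into readable text.
    best_seq, _ = max(beams, key=lambda b: b[1])
    return detokenize(best_seq)
</syntaxhighlight>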
Now, to break down the steps of Beam Search:
Conditional Probability Notes:
* After the algorithm is finished, we take the best option from the candidate sequences as our output.
If we have at most <math>m</math> words in our final sentence, each state has <math>n</math> word options, and we maintain a maximum beam width of <math>k</math>, this algorithm runs with a time complexity of <math>O(nmk\log(k))</math>.
[[File:Screenshot 2024-05-21 at 7.50.48 PM.png|thumb]]
* A lower beam width <math>k</math> means a faster runtime but less diversity and less optimal results.
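For a rough sense of scale (the numbers here are illustrative assumptions, not values from this page): with a vocabulary of <math>n = 10000</math> tokens, a maximum length of <math>m = 20</math>, and a beam width of <math>k = 5</math>, each of the <math>m</math> steps scores <math>nk = 50000</math> candidate extensions and keeps the best <math>k</math> of them (roughly <math>nk\log k</math> comparisons with a size-<math>k</math> heap), for a total on the order of

<math>n \cdot m \cdot k \cdot \log_2 k \approx 10000 \cdot 20 \cdot 5 \cdot 2.32 \approx 2.3 \times 10^6</math> operations.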
=== Other Notes ===
* '''Softmax Function:'''
[[File:Screenshot 2024-05-21 at 6.56.45 PM.png|thumb]]
* ensures that all probabilities at each state are in the range (0,1) and sum to 1
* '''Log semiring trick:'''
* Using the [https://en.wikipedia.org/wiki/Log_semiring log semiring] to deal with very small probability values: given <math>P</math> as the cumulative probability score of all words in the sequence so far, <math>P=p_0 \cdot p_1 \cdot p_2 \cdots p_n</math>
* Since the value of <math>P</math> can become very small, we can run into computational rounding errors. One strategy around this is to work with the sum of the natural logs of the individual probabilities and use that to compare values of <math>P</math> (see the sketch after this list). This works because:
* <math>P=p_0 \cdot p_1 \cdot p_2 \cdots p_n</math>
* <math>\log(P)=\log(p_0)+\log(p_1)+\log(p_2)+\cdots+\log(p_n)</math>
* since <math>\log</math> is monotonically increasing, <math>P_1>P_2 \Leftrightarrow \log(P_1)>\log(P_2)</math>, so values can still be compared in this form
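A short sketch of both points, softmax normalization and the log-probability comparison, using only the Python standard library; the score and probability lists below are made-up illustrative values:

<syntaxhighlight lang="python">
import math

def softmax(scores):
    # Subtract the max score before exponentiating for numerical stability;
    # each output lies in (0, 1) and the outputs sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([2.0, 1.0, 0.1]))   # ~[0.659, 0.242, 0.099], sums to 1

# Two hypothetical candidate sequences as lists of per-token probabilities.
seq_1 = [0.4, 0.3, 0.2, 0.1, 0.05]
seq_2 = [0.5, 0.1, 0.2, 0.2, 0.05]

# Multiplying many per-token probabilities drives P toward 0 and can
# underflow for long sequences, so we sum logs instead.
P_1 = math.prod(seq_1)
P_2 = math.prod(seq_2)
log_P_1 = sum(math.log(p) for p in seq_1)
log_P_2 = sum(math.log(p) for p in seq_2)

# The ordering is preserved because log is monotonically increasing.
assert (P_1 > P_2) == (log_P_1 > log_P_2)
</syntaxhighlight>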