Hello, how can I get the probability for next sentence prediction using BERT?
The example gives arrays of different lengths for the two sentences, according to the number of tokens in each one. What should I do next?
I found this type and some examples from GSoC 2020, but how do I put it all together? I'm new to this and hope for your help.
@chengchingwen
This code doesn't work:
using Transformers
using Transformers.Basic
using Transformers.Pretrain
using Transformers.HuggingFace

ENV["DATADEPS_ALWAYS_ACCEPT"] = true

# bert_model from the pretrain is not used; only the wordpiece and tokenizer are
bert_model, wordpiece, tokenizer = pretrain"bert-uncased_L-12_H-768_A-12"
vocab = Vocabulary(wordpiece)

# HuggingFace model with the next-sentence-prediction head
model = hgf"bert-base-uncased:fornextsentenceprediction"

# wordpiece-tokenize both sentences
text1 = "Aesthetic Appreciation and Spanish Art:" |> tokenizer |> wordpiece
text2 = "Insights from Eye-Tracking" |> tokenizer |> wordpiece

# standard BERT input: [CLS] sentence1 [SEP] sentence2 [SEP]
formatted_text = ["[CLS]"; text1; "[SEP]"; text2; "[SEP]"]
token_indices = vocab(formatted_text)
# segment ids: 1 for the first sentence (incl. [CLS] and the first [SEP]), 2 for the second
segment_indices = [fill(1, length(text1) + 2); fill(2, length(text2) + 1)]

model(token_indices, segment_indices)
ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 14 and 768")
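For context, once the forward pass works, what I was planning to do next is roughly the sketch below: apply softmax to the two next-sentence-prediction logits to get the probability that text2 follows text1. This is only my guess, not from the docs; I'm assuming the call returns the raw NSP logits as a 2-element array and that the first entry is the "is next" class.

# Sketch of the step I expect to come after the model call (assumptions noted above)
using Flux: softmax

nsp_logits = model(token_indices, segment_indices)  # this is the call that currently errors
nsp_probs  = softmax(nsp_logits)                    # turn the two logits into probabilities
is_next_prob = nsp_probs[1]                         # assumed ordering of the two classes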