First, using the demonstrations significantly outperforms the no demonstrations method
even with small k (k = 4), and performance drop
from using gold labels to using random labels is
consistently small across varying k, in the range of
0.8–1.6%.
Interestingly, model performance does
not increase much as k increases when k ≥ 8, both
with gold labels and with random labels.
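
To make the comparison above concrete, here is a minimal sketch of how the three prompt conditions (no demonstrations, k demonstrations with gold labels, k demonstrations with random labels) could be built. It is not the code used by Min et al. (2022). The function name `build_few_shot_prompt`, the `Review:`/`Sentiment:` template, and the toy examples are all assumptions made for illustration. Random labels are drawn uniformly from the label set.

```python
import random


def build_few_shot_prompt(examples, query, k, labels="gold", label_space=None, seed=0):
    """Format k demonstrations followed by the unlabeled query.

    examples: list of (text, gold_label) pairs (hypothetical toy data).
    labels:   "gold" keeps each example's true label;
              "random" replaces it with a label drawn uniformly from label_space.
    k = 0 yields the no-demonstrations prompt.
    """
    if k > len(examples):
        raise ValueError("k exceeds the number of available examples")
    if labels not in ("gold", "random"):
        raise ValueError("labels must be 'gold' or 'random'")
    if label_space is None:
        # Assumption: the label set is whatever labels appear in the examples.
        label_space = sorted({gold for _, gold in examples})

    rng = random.Random(seed)  # seeded so the random condition is reproducible
    blocks = []
    for text, gold in examples[:k]:
        label = gold if labels == "gold" else rng.choice(label_space)
        blocks.append(f"Review: {text}\nSentiment: {label}")
    # The query uses the same format but leaves the label for the model to fill in.
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)


# Toy data, invented for this sketch.
EXAMPLES = [
    ("A delightful, warm film.", "positive"),
    ("Tedious and far too long.", "negative"),
    ("The cast is superb.", "positive"),
    ("I want my two hours back.", "negative"),
]
QUERY = "An instant classic."

zero_shot_prompt = build_few_shot_prompt(EXAMPLES, QUERY, k=0)
gold_prompt = build_few_shot_prompt(EXAMPLES, QUERY, k=4, labels="gold")
random_prompt = build_few_shot_prompt(EXAMPLES, QUERY, k=4, labels="random")
```

Sending `zero_shot_prompt`, `gold_prompt`, and `random_prompt` to the same model and comparing accuracy over a test set is one way to reproduce the kind of comparison described in the quote. Varying `k` with the other arguments held fixed shows how performance changes with the number of demonstrations.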
Following the findings from Min et al. (2022), here are a few more tips about demonstrations/exemplars when doing few-shot: