"answer": "少样本提示 (few-shot prompting),推理与行动 (reasoning and acting),模仿学习 (imitation learning)",
"analyze": "论文 Section 4 'ALFWorld' 部分提到:'To prompt ReAct, we randomly annotate three trajectories from the training set for each task type... For baselines, we use BUTLER (Shridhar et al., 2020b), an imitation learning agent trained on 10^5 expert trajectories for each task type.' 这直接对比了 ReAct 和 BUTLER 的训练/学习方式。",