Click on each task to view more details. For more information, refer to SEA-BED: Southeast Asia Embedding Benchmark.
Classification
Learn a classifier over sentence embeddings to assign labels to individual sentences.
Metric: F1
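A minimal sketch of this setup, assuming a toy nearest-centroid classifier over hypothetical 4-dimensional sentence embeddings (the real benchmark trains a proper classifier on model-produced embeddings):

```python
import numpy as np

# Hypothetical sentence embeddings with binary labels (illustrative only).
X_train = np.array([[1.0, 0.1, 0.0, 0.0],
                    [0.9, 0.2, 0.1, 0.0],
                    [0.0, 0.1, 1.0, 0.9],
                    [0.1, 0.0, 0.8, 1.0]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.95, 0.1, 0.05, 0.0],
                   [0.05, 0.0, 0.9, 1.0]])
y_test = np.array([0, 1])

# Nearest-centroid classifier: one centroid per class, assign by distance.
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(np.linalg.norm(X_test[:, None] - centroids[None], axis=2), axis=1)

# Binary F1 for the positive class.
tp = int(((pred == 1) & (y_test == 1)).sum())
fp = int(((pred == 1) & (y_test == 0)).sum())
fn = int(((pred == 0) & (y_test == 1)).sum())
f1 = 2 * tp / (2 * tp + fp + fn)
print(f1)  # 1.0 on this toy split
```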
Multi-label Classification
Predict multiple labels for each input text using a classifier trained on embeddings.
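In the multi-label setting, each label gets its own score and every label clearing a threshold is predicted. A minimal sketch, assuming hypothetical per-label scores from one-vs-rest classifiers and an assumed 0.5 threshold:

```python
import numpy as np

# Hypothetical per-label scores for one input text (illustrative only).
labels = ["politics", "sports", "economy"]
scores = np.array([0.81, 0.12, 0.66])

# Binary relevance: predict every label whose score clears the threshold.
predicted = [label for label, s in zip(labels, scores) if s >= 0.5]
print(predicted)  # ['politics', 'economy']
```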
Pair Classification
Predict a binary relationship between two sentences based on their embedding similarity.
Metric: Average Precision (AP)
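Average precision ranks all pairs by predicted similarity and averages the precision at each true pair. A minimal sketch with made-up similarity scores and gold labels:

```python
import numpy as np

def average_precision(scores, labels):
    # Rank pairs by predicted similarity, then average precision at each positive.
    order = np.argsort(scores)[::-1]
    ranked = np.asarray(labels)[order]
    hits = np.cumsum(ranked)
    precisions = hits / (np.arange(len(ranked)) + 1)
    return float((precisions * ranked).sum() / ranked.sum())

# Hypothetical cosine similarities and gold pair labels (1 = related).
sims = np.array([0.9, 0.8, 0.3, 0.1])
gold = np.array([1, 0, 1, 0])
ap = average_precision(sims, gold)
print(round(ap, 4))  # 0.8333
```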
Semantic Textual Similarity (STS)
Score sentence pair similarity using embedding distance metrics.
Metric: Spearman correlation (of cosine-similarity scores with gold ratings)
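STS evaluation typically scores each pair by cosine similarity of its embeddings, then correlates those scores with human ratings. A minimal sketch, assuming rank-based (Spearman) scoring over hypothetical pair embeddings and gold scores on a 0-5 scale:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def spearman(x, y):
    # Spearman correlation = Pearson correlation of the ranks (assumes no ties).
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

# Hypothetical pair embeddings and gold similarity ratings (illustrative only).
pairs = [(np.array([1.0, 0.2]), np.array([0.9, 0.3])),
         (np.array([1.0, 0.0]), np.array([0.0, 1.0])),
         (np.array([0.5, 0.5]), np.array([0.4, 0.7]))]
gold = [4.8, 0.5, 3.9]
sims = [cosine(a, b) for a, b in pairs]
print(round(spearman(sims, gold), 4))  # 1.0: the model ranks pairs exactly like the gold scores
```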
Clustering
Group semantically similar texts into clusters using k-means over embeddings.
Metric: V-measure
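V-measure is the harmonic mean of homogeneity and completeness, both derived from entropies of the cluster/label assignments. A minimal sketch of the metric itself (cluster ids would come from k-means over embeddings in practice; here they are given directly):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def conditional_entropy(a, b):
    # H(A|B) from joint and marginal counts.
    n = len(a)
    joint, b_counts = Counter(zip(a, b)), Counter(b)
    return -sum((nab / n) * math.log(nab / b_counts[kb])
                for (_, kb), nab in joint.items())

def v_measure(true, pred):
    h_c, h_k = entropy(true), entropy(pred)
    hom = 1 - conditional_entropy(true, pred) / h_c if h_c else 1.0
    comp = 1 - conditional_entropy(pred, true) / h_k if h_k else 1.0
    return 2 * hom * comp / (hom + comp) if hom + comp else 0.0

# A perfect clustering scores 1.0 even though the cluster ids are permuted.
print(v_measure([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```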
Bitext Mining
Identify cross-lingual translation pairs via nearest-neighbour embedding retrieval.
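The retrieval step above can be sketched as follows, assuming toy 2-dimensional embeddings for source and target sentences: normalize both sides, then match each source to its highest-cosine target.

```python
import numpy as np

# Hypothetical embeddings of source sentences and candidate translations.
src = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
tgt = np.array([[0.0, 0.9], [0.8, 0.8], [0.9, 0.1]])

# Normalize rows so the dot product equals cosine similarity.
s = src / np.linalg.norm(src, axis=1, keepdims=True)
t = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)

# Each source sentence picks its nearest target as the translation pair.
pairs = (s @ t.T).argmax(axis=1)
print(pairs)  # [2 0 1]
```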
Retrieval
Retrieve relevant documents for a query using embedding similarity search.
Metric: NDCG@10
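NDCG@k ranks documents by similarity score, sums relevance discounted by log2(rank + 1) over the top k, and normalizes by the ideal ranking. A minimal sketch with one hypothetical query:

```python
import numpy as np

def ndcg_at_k(scores, rels, k=10):
    # Rank by score; discounted gain uses log2(rank + 1), normalized by the ideal ranking.
    order = np.argsort(scores)[::-1][:k]
    gains = np.asarray(rels, float)[order]
    disc = 1.0 / np.log2(np.arange(2, len(gains) + 2))
    ideal = np.sort(np.asarray(rels, float))[::-1][:k]
    idcg = float(ideal @ (1.0 / np.log2(np.arange(2, len(ideal) + 2))))
    return float(gains @ disc) / idcg if idcg else 0.0

# Hypothetical similarity scores and binary relevance labels per document.
scores = [0.9, 0.2, 0.7]
rels = [1, 1, 0]
print(round(ndcg_at_k(scores, rels, k=10), 4))  # 0.9197
```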
Instruction Retrieval
Retrieve documents using queries enriched with natural language relevance instructions.
Metric: NDCG@5
Reranking
Improve relevance ranking by reordering candidates using embedding similarity.
Metric: Mean Average Precision (MAP)
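MAP averages per-query average precision over the reranked candidate lists. A minimal sketch, assuming hypothetical relevance labels in the order the reranker produced (all relevant items appear in each list):

```python
def average_precision(ranked_rels):
    # Precision at each relevant hit, averaged over the relevant items.
    hits, ap = 0, 0.0
    for rank, rel in enumerate(ranked_rels, start=1):
        if rel:
            hits += 1
            ap += hits / rank
    return ap / hits if hits else 0.0

# Relevance of candidates after reranking by embedding similarity, one list per query.
reranked = [[1, 0, 1, 0], [0, 1]]
map_score = sum(average_precision(q) for q in reranked) / len(reranked)
print(round(map_score, 4))  # 0.6667
```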