2022 IEEE 19th India Council International Conference (INDICON)
DOI: 10.1109/indicon56171.2022.10040147
Analysis of Subword based Word Representations Case Study: Fasttext Malayalam

Cited by 1 publication (1 citation statement, published 2023)
References: 34 publications
“…However, all six models share a similar architecture which is based on 1-D convolutional neural networks (CNNs) which take blocks of raw bytes as input and embed them into a trainable latent space. Shifting individual bytes into a latent space was inspired by the current state-of-the-art natural language processing models where words, or sub-words, are embedded into a common latent space before being sent through a neural network [30]- [34]. The use of byte embeddings instead of 1-hot encoding or hand-crafted features such as input is, arguably, one of the key insights offered by the FiFTy research paper.…”
Section: B. Approaches to File Fragment Type Identification
Confidence: 99%
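The excerpt above describes an architecture in which raw bytes are mapped through a trainable embedding table and then processed by a 1-D CNN, analogous to how words or subwords are embedded before being fed to NLP models such as FastText. Below is a minimal sketch of that idea, assuming PyTorch; the layer sizes, kernel widths, and class count are illustrative assumptions and do not reproduce the cited FiFTy models.

import torch
import torch.nn as nn

class ByteEmbeddingCNN(nn.Module):
    """Sketch: embed raw bytes into a latent space, then apply a 1-D CNN."""
    def __init__(self, embed_dim=32, num_classes=75):  # sizes are assumptions
        super().__init__()
        # 256 possible byte values, each mapped to a learned embedding vector
        self.embed = nn.Embedding(num_embeddings=256, embedding_dim=embed_dim)
        self.conv = nn.Sequential(
            nn.Conv1d(embed_dim, 64, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),   # pool over the block length
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, byte_block):
        # byte_block: LongTensor of shape (batch, block_len), values 0-255
        x = self.embed(byte_block)      # (batch, block_len, embed_dim)
        x = x.transpose(1, 2)           # (batch, embed_dim, block_len) for Conv1d
        x = self.conv(x).squeeze(-1)    # (batch, 64)
        return self.classifier(x)       # per-class logits

# Usage: classify a batch of two hypothetical 4096-byte fragments
model = ByteEmbeddingCNN()
blocks = torch.randint(0, 256, (2, 4096))
logits = model(blocks)

The key point the citation statement makes is that the embedding table replaces 1-hot encoding or hand-crafted features: the mapping from byte values to the latent space is itself learned during training.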