Next Article in Journal
Improving Scalable K-Means++
Next Article in Special Issue
Subpath Queries on Compressed Graphs: A Survey
Previous Article in Journal
A Discrete-Continuous Algorithm for Free Flight Planning
Previous Article in Special Issue
Computing Maximal Lyndon Substrings of a String
Open AccessArticle

Re-Pair in Small Space

1
M&D Data Science Center, Tokyo Medical and Dental University, Tokyo 113-8510, Japan
2
Kyushu Institute of Technology, Fukuoka 820-8502, Japan
3
Graduate School of IST, Hokkaido University, Hokkaido 060-0814, Japan
4
Fujitsu Laboratories Ltd., Kawasaki 211-8588, Japan
*
Author to whom correspondence should be addressed.
This paper is an extended version of our paper published in the Prague Stringology Conference 2020: Prague, Czech Republic, 31 August–2 September 2020 and at the Data Compression Conference 2020: Virtual Conference, 24–27 March 2020.
Received: 29 November 2020 / Revised: 18 December 2020 / Accepted: 18 December 2020 / Published: 25 December 2020
(This article belongs to the Special Issue Combinatorial Methods for String Processing)
Re-Pairis a grammar compression scheme with favorably good compression rates. The computation of Re-Pair comes with the cost of maintaining large frequency tables, which makes it hard to compute Re-Pair on large-scale data sets. As a solution for this problem, we present, given a text of length n whose characters are drawn from an integer alphabet with size σ=nO(1), an O(min(n2,n2lglogτnlglglgn/logτn)) time algorithm computing Re-Pair with max((n/c)lgn,nlgτ)+O(lgn) bits of working space including the text space, where c1 is a fixed user-defined constant and τ is the sum of σ and the number of non-terminals. We give variants of our solution working in parallel or in the external memory model. Unfortunately, the algorithm seems not practical since a preliminary version already needs roughly one hour for computing Re-Pair on one megabyte of text. View Full-Text
Keywords: grammar compression; Re-Pair; computation in small space; broadword techniques grammar compression; Re-Pair; computation in small space; broadword techniques
Show Figures

Figure 1

MDPI and ACS Style

Köppl, D.; I, T.; Furuya, I.; Takabatake, Y.; Sakai, K.; Goto, K. Re-Pair in Small Space. Algorithms 2021, 14, 5. https://0-doi-org.brum.beds.ac.uk/10.3390/a14010005

AMA Style

Köppl D, I T, Furuya I, Takabatake Y, Sakai K, Goto K. Re-Pair in Small Space. Algorithms. 2021; 14(1):5. https://0-doi-org.brum.beds.ac.uk/10.3390/a14010005

Chicago/Turabian Style

Köppl, Dominik; I, Tomohiro; Furuya, Isamu; Takabatake, Yoshimasa; Sakai, Kensuke; Goto, Keisuke. 2021. "Re-Pair in Small Space" Algorithms 14, no. 1: 5. https://0-doi-org.brum.beds.ac.uk/10.3390/a14010005

Find Other Styles
Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers. See further details here.

Article Access Map by Country/Region

1
Search more from Scilit
 
Search
Back to TopTop