[GCJ 2014 #2] Trie Sharding

ID: 15125

远端评测题

3000~5000ms

1024MiB

尝试: 0

已通过: 0

难度: 8

上传者:

Hydro

标签>

2014

树形 DP

组合数学

字典树 Trie

Google Code Jam

题目描述

A set of strings $\mathbf{S}$ can be stored efficiently in a trie. A trie is a rooted tree that has one node for every prefix of every string in $\mathbf{S}$ , without duplicates.

For example, if $\mathbf{S}$ were "AAA", "AAB", "AB", "B", the corresponding trie would contain 7 nodes corresponding to the prefixes "", "A", "AA", AAA", "AAB", "AB", and "B".

I have a server that contains $\mathbf{S}$ in one big trie. Unfortunately, $\mathbf{S}$ has become very large, and I am having trouble fitting everything in memory on one server. To solve this problem, I want to switch to storing $\mathbf{S}$ across $\mathbf{N}$ separate servers. Specifically, $\mathbf{S}$ will be divided up into disjoint, non-empty subsets $\mathbf{T}_{1}, \mathbf{T}_{2}, \ldots, \mathbf{T}_{\mathbf{N}}$, and on each server i, I will build a trie containing just the strings in $\mathbf{T}_{\mathbf{i}}$ . The downside of this approach is the total number of nodes across all $\mathbf{N}$ tries may go up. To make things worse, I can't control how the set of strings is divided up!

For example, suppose "AAA", "AAB", "AB", "B" are split into two servers, one containing "AAA" and "B", and the other containing "AAB", "AB". Then the trie on the first server would need 5 nodes ("", "A", "AA", "AAA", "B"), and the trie on the second server would also need 5 nodes ("", "A", "AA", "AAB", "AB"). In this case, I will need 10 nodes altogether across the two servers, as opposed to the 7 nodes I would need if I could put everything on just one server.

Given an assignment of strings to $\mathbf{N}$ servers, I want to compute the worst-case total number of nodes across all servers, and how likely it is to happen. I can then decide if my plan is good or too risky.

Given $\mathbf{S}$ and $\mathbf{N}$ , what is the largest number of nodes that I might end up with? Additionally, how many ways are there of choosing $\mathbf{T}_{1}, \mathbf{T}_{2}, \ldots, \mathbf{T}_{\mathbf{N}}$ for which the number of nodes is maximum? Note that the $\mathbf{N}$ servers are different -- if a string appears in $\mathbf{T}_{\mathbf{i}}$ in one arrangement and in $\mathbf{T}_{\mathbf{j}}$ ( $i \neq j$ ) in another arrangement, then the two arrangements are considered different. Print the remainder of the number of possible arrangements after division by 1,000,000,007.

输入格式

The first line of the input gives the number of test cases, $\mathbf{T}$ . $\mathbf{T}$ test cases follow. Each test case starts with a line containing two space-separated integers: $\mathbf{M}$ and $\mathbf{N}$ . $\mathbf{M}$ lines follow, each containing one string in $\mathbf{S}$ .

输出格式

For each test case, output one line containing "Case #i: $\mathbf{X} \mathbf{Y}$ ", where i is the case number (starting from 1), $\mathbf{X}$ is the worst-case number of nodes in all the tries combined, and $\mathbf{Y}$ is the number of ways (modulo 1,000,000,007) to assign strings to servers such that the number of nodes in all $\mathbf{N}$ servers are $\mathbf{X}$ .

2
4 2
AAA
AAB
AB
B
5 2
A
B
C
D
E

Case #1: 10 8
Case #2: 7 30

提示

Limits

$1 \leq T \leq 100$ .
Strings in $\mathbf{S}$ will contain only upper case English letters.
The strings in $\mathbf{S}$ will all be distinct.
$\mathrm{N} \leq \mathrm{M}$

Small dataset(9 Pts)

Time limit: 60 3 seconds.
$1 \leq M \leq 8$
$1 \leq N \leq 4$
Each string in $\mathbf{S}$ will have between 1 and 10 characters, inclusive.

Large dataset(30 Pts)

Time limit: ~~120~~ 5 seconds.
$1 \leq M \leq 1000$
$1 \leq N \leq 100$
Each string in $\mathbf{S}$ will have between 1 and 100 characters, inclusive.

#P13259. [GCJ 2014 #2] Trie Sharding

题目描述

输入格式

输出格式

提示

还没有账户？

登录