#P14095. [ICPC 2023 Seoul R] Tandem Copy

[ICPC 2023 Seoul R] Tandem Copy

题目描述

Tandem copy is an operation on a DNA where a consecutive sequence of one or more nucleotides is repeated, and the repetitions are directly adjacent to each other; in other words, the tandem copy operation makes a copy of a consecutive sequence of nucleotides and pastes the copy right after the copied sequence. For example, ATCATCG\texttt{ATCATCG} is resulted from the tandem copy of ATC\texttt{ATC} in ATCG\texttt{ATCG}. Furthermore, we can continue another tandem copy on the resulted sequence ATCATCG\texttt{ATCATCG} and obtain ATCATTCG\texttt{ATCATTCG}. The following example illustrates a series of tandem copies from ATCG\texttt{ATCG}, where the underlined sequence is copied at each step.

$$\texttt{\underline{ATC}G}\Rightarrow\texttt{ATCA\underline{T}CG}\Rightarrow\texttt{\underline{AT}CATTCG}\Rightarrow\texttt{ATATC\underline{ATT}CG}\Rightarrow\dots $$

We say that ATCG\texttt{ATCG} produces all these sequences by tandem copy. It is easy to see that ATCG\texttt{ATCG} can produce different sequences by selecting a different portion of the sequence to tandem copy at each step. Furthermore, in principle, ATCG\texttt{ATCG} can produce infinitely many sequences by continuing tandem copies as many as it needs.

Usually, it is more expensive to tandem copy a longer portion. For instance,

$$\texttt{\underline{ATC}G}\Rightarrow\texttt{ATCATCG} $$

is a tandem copy of three nucleotides and thus is more expensive than

$$\texttt{ATCA\underline{T}CG}\Rightarrow\texttt{ATCATTCG} , $$

which is a tandem copy of one nucleotide. In other words, the length of a copied portion at each step is crucial to determine the tandem copy cost.

Since it is easy to tandem copy a single nucleotide, it is practical for the ICPC lab to store sequences such that two consecutive nucleotides in a sequence is always different; this helps the lab to reduce the storage space. For instance, since ATTTG\texttt{ATTTG} can be produced by tandem copying T\texttt{T} twice from ATG\texttt{ATG}, it is better for the lab to only store the shorter sequence ATG\texttt{ATG} instead of ATTTG\texttt{ATTTG}.

Because of a recent budget cut, the ICPC lab can only perform the tandem copy on at most two nucleotides at one time; namely, the length of a copied portion is at most two at each step. On the other hand, the lab can continue to repeat the tandem copy as many as it desires. For example, given a sequence ABCD\texttt{ABCD}, we can apply the tandem copy operation on B\texttt{B} and obtain ABBCD\texttt{ABBCD}, or apply it on the sequence BC\texttt{BC} and obtain ABCBCD\texttt{ABCBCD}. But we cannot tandem copy the consecutive sequence ABC\texttt{ABC} because its length is longer than two.

Given a source string ss and a target string tt, your task is to count the number of all valid substrings ss' of ss, where one can obtain a string xx from ss' by applying an appropriate number of the tandem copy operations such that xx contains tt as a substring. Please note that no two consecutive nucleotides in the source string are the same, whereas two consecutive nucleotides in the target string can be the same. For example, CCA\texttt{CCA} or ATTGC\texttt{ATTGC} cannot be source strings, but they can be target strings.

Now, given s=ACATGCATs=\texttt{ACATGCAT} and t=CCACATTTt=\texttt{CCACATTT}, we take a substring s=CATGCs'=\texttt{CATGC} of ss and run a series of tandem copies as follows:

$$s'=\texttt{\underline{C}ATGC}\Rightarrow\texttt{C\underline{CA}TGC}\Rightarrow\texttt{CCACA\underline{T}GC}\Rightarrow\texttt{CCACAT\underline{T}GC}\Rightarrow\texttt{CCACATTTGC}, $$

which contains tt as its substring.

Here is another substring example. For s=CATs'=\texttt{CAT},

$$s'=\texttt{\underline{CA}T}\Rightarrow\texttt{\underline{C}ACAT}\Rightarrow\texttt{CCACA\underline{T}}\Rightarrow\texttt{CCACA\underline{T}T}\Rightarrow\texttt{CCACATTT}=t $$

which shows that we can produce the target string from CAT\texttt{CAT} by a series of tandem copies.

It is easy to verify that the total number of valid substrings of ss is 1414. Note that both the first CAT\texttt{CAT} and the second CAT\texttt{CAT} in ss are counted as different valid substrings. Thus, you need to consider all substrings of ss and count all valid substrings individually.

Here is another example. When s=ABs=\texttt{AB} and t=BAt=\texttt{BA}, you can take the substring AB\texttt{AB} and tandem copy AB\texttt{AB}. Then, the resulting string is ABAB\texttt{ABAB}, which contains BA\texttt{BA} as its substring. All other substrings of 𝑠𝑠 are unable to produce BA\texttt{BA} as a substring, and therefore the number of valid substrings is one.

Given a source string ss and a target string tt, where no two consecutive characters in ss are the same character, write a program that outputs the number of valid substrings ss' of ss. ss' is a valid substring of ss if a series of tandem copies on ss' can produce a string that contains tt as its substring, where a tandem copy is restricted to at most two consecutive characters at each step.

输入格式

Your program is to read from standard input. The input consists of two lines. The first line is the source string ss, and the second line is the target string tt. Each input consists of uppercase letters A to Z, and 1s,t2×1041\le |s|, |t| \le 2 \times 10^4.

输出格式

Your program is to write to standard output. Print exactly one line. The line should print the number of valid substrings in which a series of tandem copies can produce a string that contains the target string as its substring.

ACGCG
CCG
9
TATCGC
TTCCG
6
ABCABC
ABC
7
ABCABC
ABCABC
1