## Sunday, August 28, 2011

### SGU - 101 - Domino

This is another interesting problem.

Problem summery: You're given N (N <= 100) dominoes. Each of the dominoes has two numbers on it between 0 to 6, one in left side and another in right side. You have to arrange all the dominoes in a row so that they touch through equal marked side. If not possible print "No solution". Note that, we also have to check the graph is connected.

For example, we have three dominoes [1 2], [3, 2], [3 4]
If we arrange these dominoes like this -
[1 2] - [3 2] - [3, 4], this is invalid, because right side of first domino and left side of second domino marked with different numbers. But if we flip the second domino, to change left and right side, it will become -
[1 2] - [2 3] - [3 4], now every two adjacent dominoes touches with equal marked side. So it's a valid arrangement. So to organize it like this, we are allowed to rearrange dominoes and rotate any domino.

To solve this problem, we cannot try all possible arrangements because number of arrangements is too large. As the dominoes are marked with numbers between 0 to 6, we can consider we have a graph of 7 nodes named 0 to 6. If there is a domino [a b], we will consider there is an edge between a and b. Now we have a graph of at most 7 nodes and at most 100 edges. Our problem can be described as - is it possible to start traversing from one node and visit each of the edges exactly once?

Let's use the example from the problem. We have 5 dominoes [1 2], [2 4], [2 4], [6 4], [2 1]. We can build our graph like this - And what about the solution? See the following image, yes we can traverse all edges exactly once. The path we traversed is 4 -> 2 -> 1 -> 2 -> 4 -> 6. So, our domino arrangement will be, [4 2] - [2 1] - [1 2] - [2 4] - [4 6], if we write the indices of the dominoes this sequence is 2, 5, 1, 3, 4. Note that we flipped one [2 4] tile to [4 2] and [6 4] tile to [4 6]. We also need to output which domino we used as it is by a '+' sign, and which domino we flipped by a '-' sign. So our answer is -
2 -
5 +
1 +
3 +
4 -
Which is also given in the sample output.

Leonhard Euler proved that, we can visit every edge exactly once in a graph if number of nodes having odd degree is not more than two. If number of odd degree in the graph is two, we have to start from one of them and end our tour in another of them. By the name of Euler, this path is called Eulerian trail or Eulerian walk. For example in our graph in the image above, node 4 and 6 have odd degree, so we started at 4 and ended at 6.

So, if the number of odd degree of the graph is more than two, we will output "No solution", otherwise we will try to print any one of the solution, by finding out the Eulerian walk. But, what will be the strategy of our Eulerian walk. We can use the strategy of Fleury's algorithm. The strategy is, start with an odd degree node if there exists any, otherwise start with any node. At each iteration we visit the edge whose deletion would not disconnect the graph.

Now, as we have n edge, so we will remove one edge at each step of our algorithm. We'll try 7 different edges and after removing that edge we'll check whether the graph is connected nor not using a dfs, in our case, which is O(n). So our algorithm is somewhat like O(n * n).

My simple C++ solution -
#include <iostream>#include <cstdio>#include <vector>#include <cstring>using namespace std;bool vis;vector<int> edge_index;int degree;// can reach v from u?bool dfs(int u, int v) {    if (u == v) {        return true;    }    vis[u] = true;    for (int i = 0; i <= 6; ++i) {        if ((edge_index[u][i].size() > 0 || edge_index[i][u].size() > 0)                && vis[i] == false) {            if (dfs(i, v)) {                return true;            }        }    }    return false;}bool is_connected() {    memset(vis, 0, sizeof(vis));    for (int i = 0; i <= 6; ++i) {        if (degree[i]) {            dfs(i, -1);            break;        }    }        for (int i = 0; i <= 6; ++i) {        if (degree[i] > 0 && vis[i] == false) {            return false;        }    }    return true;}int main (void) {    int n; // number of dominoes i.e. number of edges    scanf("%d", &n);    memset(degree, 0, sizeof(degree));    for (int i = 0; i <= 6; ++i) {        for (int j = 0; j <= 6; ++j) {            edge_index[i][j].clear();        }    }    for (int i = 1; i <= n; ++i) {        int u, v;        scanf("%d%d", &u, &v);        edge_index[u][v].push_back(i);        degree[u]++;        degree[v]++;    }    int n_odd = 0; // number of nodes with odd degree    for (int i = 0; i <= 6; ++i) {        if (degree[i] % 2 == 1) {            n_odd++;        }    }    if (n_odd > 2 || is_connected() == false) {        // if number of odd degree node is more than 2        // eulerian path doesn't exist for this graph        printf("No solution\n");    } else {        int u = -1; // starting node        // start node with an odd degree node, if exists        for (int i = 0; i <= 6; ++i) {            if (degree[i] % 2 == 1) {                u = i;                break;            }        }        // if odd degree node not found, start with a node        // with positive degree, i.e. start with any vertex,        // which is used in the graph        if (u == -1) {            for (int i = 0; i <= 6; ++i) {                if (degree[i] > 0) {                    u = i;                    break;                }            }        }        // Fleury's algorithm        // n steps - at each step, remove one edge        for (int i = 0; i < n; ++i) {            int selected_ind;            bool forward_selected;            for (int j = 0; j <= 6; ++j) {                if (edge_index[u][j].size() > 0 ||                    edge_index[j][u].size() > 0) {                    int cur_ind;                    bool forward;                    if (edge_index[u][j].size() > 0) {                        cur_ind = edge_index[u][j].back();                        edge_index[u][j].pop_back();                        forward = true;                    } else {                        cur_ind = edge_index[j][u].back();                        edge_index[j][u].pop_back();                        forward = false;                    }                    degree[u]--;                    degree[j]--;                    memset(vis, 0, sizeof(vis));                        if (degree[u] == 0 || dfs(u, j) == true) {                        // remove this edge                        selected_ind = cur_ind;                        forward_selected = forward;                        u = j; // so we are in j now.                        break;                    } else {                        // this edge cannot be removed                        // back to previous state                        if (forward) {                            edge_index[u][j].push_back(cur_ind);                        } else {                            edge_index[j][u].push_back(cur_ind);                        }                        degree[u]++;                        degree[j]++;                    }                }            }            printf("%d %c\n", selected_ind, forward_selected ? '+' : '-');        }    }    return 0;}

## Saturday, August 20, 2011

### Maximum Triangle Area (Part 3 and final)

Part 1
Part 2

After second part, our algorithm became something like this -
for each 0 <= i < n, initialize A = Pi, B = P(i + 1) % n, C = P(i + 2) % n, then move C counter clockwise until area non-decreasing, move B to its adjacent point counter clockwise, if area non-decreasing, move C again until area non decreasing and so on. You will get the maximum area for each A and output the maximum of these areas. For a given point starting point Pi, if the maximum area is ABC. We can say ABC is 'stable' triangle for A, in other word s(pi, Pi + 1, Pi + 2) = ABC, that means we cannot move B or C any further by not decreasing the area.

Now, after getting a maximum ABC for a fixed first point Pi, we will start with Pi + 1, Pi + 2 and Pi + 3. We get A'B'C' as a stable triangle i.e. s(pi, pi + 1, pi + 2) = A'B'C'. Hence our algorithm is O(n ^ 2). But it can be proved that, rather than, start from Pi + 1 as a second point and Pi + 2 as a third point, we can start with B and C as a second and third point and get the same result, i.e. s(Pi + 1, Pi + 2, Pi + 3) = s(Pi + 1, B, C), where s(Pi, Pi + 1, Pi + 2) = ABC.

Lets assume that we are wrong, and s(Pi + 1, Pi + 2, Pi + 3) = A'B'C' > s(pi, Pi + 1, Pi + 2).

Case 1:
B' < B and C' = C We know that,
ABC >= AB'C (otherwise largest triangle for A would be AB'C)
=> ABC + BB'A >= AB'C + BB'A
=> AB'BC >= AB'C + BB'A
=> AB'C + BB'C >= AB'C + BB'A
=> BB'C >= BB'A
=> BB'C >= BB'A' (for base BB' area is non decreasing from BB'C to BB'A, so it
will be non decreasing from BB'A to BB'C)
=> BB'C + A'B'C >= BB'A' + A'B'C
=> A'B'BC >= BB'A' + A'B'C
=> A'BC + BB'A' >= BB'A' + A'B'C
=> A'BC >= A'B'C
=> A'BC >= A'B'C' ......  (as C = C')

So, for case 1, s(Pi + 1, Pi + 2, Pi + 3) = s(Pi + 1, B, C).

Case 2:
Let's say, A'B'C' is the largest triangle for first point A'.
Where, B' < B and C' > C. See the following image. For first point A, ABC is the largest trianlge,
So, ABC >= AB'C
=> ABC + BB'C >= ABC + BB'C
=> ABC + BB'C >= AB'BC
=> ABC + BB'C >= ABC + BB'A
=> BB'C >= BB'A
i.e. BB'A <= BB'C Hence, BB'A <= BB'C' (otherwise, BB'A > BB'C and BB'A <= BB'C, which is not possible for a convex polygon) So, BB'A' <= BB'C' ...  (as the triangle area non increasing by moving C' to A, it will be non increasing from A to A') Now as we are considering, for first point A', the largest area triangle is A'B'C'. So, A'B'C' > A'BC'
=> A'B'C' + BB'A' > A'BC' + BB'A
=> A'B'C' + BB'A' > A'B'BC'
=> A'B'C' + BB'A' > A'B'C' + BB'C'
=> BB'A' > BB'C', this contradicts to .

So it is not possible to have, A'B'C' > A'BC' i.e. for case 2, s(Pi + 1, Pi + 2, Pi + 3) = s(Pi + 1, B, C).

Case 3:
Let's say, A'B'C' is the largest triangle for fixed first point A', where,
B' < B and C' < C As ABC is the largest triangle having first point A,
either ABC' >= AB'C' (by moving B' to B, area increases) .. (3A)
or AB'C >= AB'C' (by moving C' to C, area increases) .. (3B)

Case 3A
ABC' >= AB'C'
=> ABC' + BB'A >= AB'C' + BB'A
=> AB'BC' >= AB'C' + BB'A
=> AB'C' + BB'C' >= AB'C' + BB'A
=> BB'C' >= BB'A
=> BB'C' >= BB'A' (As by moving C' to A area non increasing, moving A to A' will area be non increasing)
=> BB'C + A'B'C' >= BB'A' + A'B'C'
=> BB'A'C' >= BB'A' + A'B'C'
=> BB'A' + A'BC' >= BB'A' + A'B'C'
=> A'BC >= A'B'C'

So it is not possible to have, A'B'C' > A'BC' i.e. for case 3A, s(Pi + 1, Pi + 2, Pi + 3) = s(Pi + 1, B, C).

Case 3B
AB'C >= AB'C'
=> AB'C + CC'B' >= AB'C' + CC'B'
=> AB'C'C >= AB'C' + CC'B'
=> AB'C' + CC'A >= AB'C' + CC'B'
=> CC'A >= CC'B'
=> CC'A' >= CC'B' (As moving A to B', area decreases, it's not possible moving A' to B' area increases)
=> CC'A' + A'B'C' >= CC'B' + A'B'C'
=> A'B'C'C >= CC'B' + A'B'C'
=> CC'B' + A'B'C >= CC'B' + A'B'C'
=> A'B'C >= A'B'C'

So it is not possible to have, A'B'C' > A'B'C i.e. for case 3B, s(Pi + 1, Pi + 2, Pi + 3) = s(Pi + 1, B, C).

Case 4:
C' < C and B' = B Here,
ABC >= ABC'
=> ABC + CC'A >= ABC' + CC'A
=> ABC + CC'A >= ABC'C
=> ABC + CC'A >= ABC + CC'B
=> CC'A >= CC'B
=> CC'A' >= CC'B
=> CC'A' + A'BC' >= CC'B + A'BC'
=> A'BC'C >= CC'B + A'BC'
=> A'BC + CC'B >= CC'B + A'BC'
=> A'BC >= A'BC'
=> A'BC >= A'B'C' (as B' = B)

So it is not possible to have, A'B'C' > A'BC i.e. for case 3B, s(Pi + 1, Pi + 2, Pi + 3) = s(Pi + 1, B, C).

Case 5:
B' > B and C' < C Now,
A'BC >= A'BC' (From case 4)
=> A'BC + CC'B >= A'BC' + CC'B
=> A'BC'C >= A'BC' + CC'B
=> A'BC' + CC'A' >= A'BC' + CC'B
=> CC'A' >= CC'B
=> CC'A' >= CC'B'
=> CC'A' + A'B'C' >= CC'B' + A'B'C'
=> CC'B'A' >= CC'B' + A'B'C'
=> CC'B' + A'B'C >= CC'B' + A'B'C'
=> A'B'C >= A'B'C'

So it is not possible to have, A'B'C' > A'B'C i.e. for case 5, s(Pi + 1, Pi + 2, Pi + 3) = s(Pi + 1, B, C).

So, from all of the 5 possible cases, we found that, s(Pi + 1, Pi + 2, Pi + 3) = s(Pi + 1, B, C). So our algorithm will still work, if we start our iteration for first point i = i + 1, without changing the position 2nd and 3rd point. We will move first point from 0 to n - 1, and 2nd and 3rd point will also not move more than n times. So our algorithm in worst case is O (3 * n) = O(n)

My simple implementation in C++.

#include<cstdio>#include<cstring>#include<cmath>#include<cstdlib>#include<vector>#include<string>#include<algorithm>#include<set>#include<map>#include<cassert>#include<sstream>#include<queue>#include<stack>#include<iostream>using namespace std;struct Point {    int x;    int y;    Point(int _x, int _y) : x(_x), y(_y) {}    Point() : x(0), y(0) {}    bool operator <(const Point& p) const {        return x < p.x || (x == p.x && y < p.y);    }};bool left_turn(const Point& p1, const Point& p2, const Point& p3) {    return (p2.y - p1.y) * (p3.x - p1.x) - (p2.x - p1.x) * (p3.y - p1.y) > 0;}// Returns list of points of convex hull in counter clockwise// The last point and first point will be equalvector<Point> convex_hull(vector<Point> p) {    vector<Point> ret;    sort(p.begin(), p.end());    // build lower hull    for (int i = 0; i < p.size(); ++i) {        while (ret.size() >= 2 &&               left_turn(ret[ret.size() - 2], ret[ret.size() - 1], p[i])) {            ret.pop_back();        }        ret.push_back(p[i]);    }    int lower_hull_size = ret.size();    // build upper hull    for (int i = p.size() - 2; i >= 0; --i) {        while (ret.size() - lower_hull_size >= 1 &&               left_turn(ret[ret.size() - 2], ret[ret.size() - 1], p[i])) {            ret.pop_back();        }        ret.push_back(p[i]);    }    return ret;}double triangle_area (const Point& p1, const Point& p2, const Point& p3) {    return abs((double) p1.x * p2.y +               (double) p2.x * p3.y +               (double) p3.x * p1.y -               (double) p1.y * p2.x -               (double) p2.y * p3.x -               (double) p3.y * p1.x) / 2.0;}int main (void){    int n;    while (scanf("%d", &n) && n != -1) {        vector<Point> p;        for (int i = 0; i < n; ++i) {            int x, y;            scanf("%d%d", &x, &y);            p.push_back(Point(x, y));        }        p = convex_hull(p); p.pop_back();        int i = 0;        int j = i + 1;        int k = j + 1;        double res = 0.;        while (true) {            double area = triangle_area(p[i], p[j], p[k]);            while (true) {                while (true) {                    int nk = (k + 1) % n;                    double narea = triangle_area(p[i], p[j], p[nk]);                    if (narea >= area) {                        k = nk;                        area = narea;                    } else {                        break;                    }                }                int nj = (j + 1) % n;                double narea = triangle_area(p[i], p[nj], p[k]);                if (narea >= area) {                    j = nj;                    area = narea;                } else {                    break;                }            }            if (area > res) res = area;            i++;            if (i == j) j = (j + 1) % n;            if (j == k) k = (k + 1) % n;            if (i == n) break;        }        printf("%.2lf\n", res);    }    return 0;}

## Thursday, August 18, 2011

### Maximum Triangle Area (Part 2)

I hope you enjoyed the part 1. In part - 1. We came up to an algorithm of complexity O(n ^ 2 log n). We'll try to improve it now.

Our problem is now reduced to: Given a convex polygon having n points. Find three points from these n points which form the largest triangle area.

Let's consider our points are ordered in counter clockwise and named P0, P1, ... , Pn - 1.

Let's start with A = P0, B = P1, C = P2. Now we will be moving C counter clockwise order (i.e. P3, P4, ...), as long as the triangle area doesn't decrease. And we know(from part 1), the area will not increase anymore after this point. So let's stop moving C. Let's say, we stopped C at Pk. So we found the largest triangle, having A = P0, and B = P1. Now it's time to move B. Let's move B to the adjacent point clockwise, if the area doesn't decreases. Now for new AB we have to find the best position of C, so that area is maximized. We will find that, we don't need to iterate from the next point of B, rather we can proceed with C as where it is now and try to move it counter clockwise as long the area doesn't decrease. Will it work always? Let's consider that it will not work, i.e. There will be a case where for a given AB, we will find maximum triangle ABC, and after moving B to B', we will find maximum triangle AB'C', where C' comes before C (By C' comes before C, I mean if we count points of the polygon starting from A, C' will come before C), and AB'C' > ABC, as the following picture. In the image above, we got ABC as the largest triangle for base AB. And after moving B to B', we got the largest triangle AB'C', where C' comes before C.
we know that, ABC >= ABC' ...  (As ABC is the largest triangle for base AB)
Now,
ABC' + ACC' = ABC'C = ABC + BCC'
=> ACC' >= BCC'  (as ABC >= ABC' from )

From (2), we can say, ACC' >= B'CC' (We already told for a given base, AB, if we move another point C, it will increase area of ABC up to a certain point and than start decreasing, i.e. it will never increase again after that point. Here for base CC', it started decrease from point A to B, so it will not increase from B to B'. As ACC' >= BCC', so ACC' >= B'CC').

ACC' >= B'CC'
=> ACC' + AB'C' >= B'CC' + AB'C' (adding AB'C' in both side)
=> AB'C'C >= B'CC' + AB'C'
=> AB'C + B'CC' >= B'CC' + AB'C'
=> AB'C >= AB'C'

So, for base AB' any point C' before C will not increase the area. So we can safely iterate from current position of C to find a larger area triangle. After finding the best position of C, move B again, if it increases the area. If moving B doesn't increases the area, stop moving B. Question is, is there any chance of getting larger triangle don't stop moving B in this case? Let's consider that there is a chance to get a larger area. For example, in the above picture, for base AB we found ABC as the largest triangle, but after moving B to B' we found AB'C < ABC. Now for base AC, we already know that there area will not increase any more by moving B. And we also proved that area will not increase moving C in clockwise. So only way left is moving C counter clockwise. Let's say we found a solution AB'C', where C' comes after C. As this is a better solution, AB'C' > AB'C

Now, ABC > ABC' (If ABC' >= ABC, we would have already moved C to C' for base AB).
=> ABC + CC'A > ABC' + CC'A
=> ABCC' > ABC' + CC'A
=> ABC' + CC'B > ABC' + CC'A
=> CC'B > CC'A 

And, AB'C' > AB'C
=> AB'C' + CC'B' > AB'C + CC'B
=> AB'CC' > AB'C + CC'B
=> AB'C + CC'A > AB'C + CC'B
=> CC'A > CC'B 

 and  are contradictory. So if after moving B to B', the area decreases, it will not possible to get larger area for the same A. So we have to stop moving B, and this ABC is the largest triangle for a given A. As B can move up to n times and also C can move up to n times. So for a given A, our algorithm complexity is O(2n) ~ O(n). We have to do this for each possible position of A. So our algorithm becomes O(n ^ 2).

Hopefully you enjoyed it so far. We'll try to improve more in part 3.

## Wednesday, August 17, 2011

### Maximum Triangle Area (Part 1)

Today I'll discuss about another interesting problem. Though the analysis is big, hopefully you'll find it interesting.

Problem name: Maximum Triangle Area
Problem description:
Given n distinct points on a plane, your task is to ﬁnd the triangle that have the maximum area, whose vertices are from the given points.

The most obvious solution is trying all possible triangle and find out the maximum area. The complexity of that algorithm is O(n^3). Which is huge as n can be as large as 50000.

So what can we do now? An interesting fact is, we can solve this problem by only considering the points of the convex hull for the given set of points. This will work because, if there is a triangle which has a vertex A which is not in the boundary of the convex hull, there will be always a point X in convex hull such that, by moving A to X we can increase the area of the triangle. Let's see an example - We have a set of points {A, B, C, D, E, F, G, H, I}. The convex hull of these points consists of {D, E, F, G, H, I}. But let's claim that the largest triangle using these points is ABC. None of it's vertex is in convex hull. Let's say, BC is the base of this triangle and AP (perpendicular distance from A to BC) is the height of the triangle. So, the area of the triangle is 0.5 * BC * AP. Now draw a line parallel to the BC which goes through A. As A is inside the convex hull, you will find at least one vertex from the convex hull which is on the other side of the parallel line. Here in our example, we found D and BC are in the opposite side of the parallel line, so perpendicular distance from D to BC (DQ) is greater than AP. Hence, 0.5 * BC * AB < 0.5 * BC * DQ. So, triangle area DBC > triangle area ABC. See the following picture to make it clear - Now, we got a triangle DBC which has larger area than ABC, but yet we have two vertices in it which are not in the convex hull. Now in the similar way considering DB as a base, and drawing a parallel line of DB which goes through C, we will find that area of DBF is greater than area of DBC. In the next step we can easily show again, that area of DBH is greater than the area of DBF. So we found that, maximum area triangle can be found using only convex hull vertices. But if all of the given points is in the convex hull, our algorithm is still O(n ^ 3). Using the convex hull property we can improve it. Now our problem is finding the largest triangle using the vertices of a convex polygon. Let's say we have n vertices 0 to n - 1. We will try to find out three vertices A, B, C in clockwise direction from these vertices which will form the maximum triangle.

Now for a given AB, rather than iterating all of the points between B and A, to find the C, we can use ternary search to find C, as if we consider AB as a base, and iterate from the next point of B one by one, the height (of the triangle ABC) will increase in every step up to a certain vertex, and after that height will decrease in every step. This way, we can ternary search the point C for every possible AB, and the complexity will become (n ^ 2 log n), which is still not enough.

As, the analysis becoming large and large, we'll discuss the other improvements in the next part.

(Special thanks to Anna Fariha for helping me understanding some proofs of this analysis.)

## Sunday, August 7, 2011

### Perfect Memory

Problem name: Perfect Memory
Problem source: Topcoder, SRM 513, Div 1 - 500

In this problem you have a board having N * M cells. Each cell has a symbol in it's back. Each symbol is in exactly two cells in the board. So there will be (N * M) / 2 symbols. In each move you can turn two cells one by one to see the symbols behind them. If these two symbols differ, both the cell turned back to hide the symbols again. If both the symbols are same, they will remain same and never turn back again. What is the expected number of moves to make all the symbols visible, if you have the perfect memory?

Note that, having perfect memory, you'll never forget what was the symbol in a cell which you saw, but turned back. In this problem it's guaranteed that N * M will be even.

I couldn't solve this problem in real contest. I thought in each turn both of the cells will be turned simultaneously. But in the problem statement it was clearly mentioned that, two cells will be turned one by one. It seems that I am not the only one who got this wrong.

This problem can be solved by dynamic programming. Our state will be (total, seen, turn). Here,
total = number of symbols for which both cells are not seen yet i.e. at least one cell is unseen. So we are actually considering (2 * total) cells.
seen = number of symbols partially seen i.e. number of cells of which exactly one cell is seen.
turn = Is this the first turn or second turn of the move? We know each move consists of two turns.

Initially, total = (N * M) / 2, seen = 0, and turn = FIRST.

Boundary condition: If total = 0, we don't have any symbol left to consider, so expected number will be zero.

In state (total, seen, turn), we have (total * 2 - seen) cells we didn't see yet. Lets say unseen = total * 2 - seen.

Now if the turn = FIRST, here we will always open an unseen cell and two things can happen after turning a cell on.
1. With p1 probability the cell contains a symbol which is already seen in another cell.
2. With p2 probability the cell contains a new symbol which is not visible before.

Here,
p1 = seen / unseen
p2 = 1.0 - p1

If (1) happens, we will just turn on the other symbol we already saw, and both will remain open forever. So our new state will be (total - 1, seen - 1, FIRST) and one move will be completed.
If (2) happens, new state will be (total, seen + 1, SECOND)

exp_move(total, seen, FIRST) = p1 * (1 + exp_move(total - 1, seen - 1, FIRST)) +
p2 * (exp_move(total, seen + 1, SECOND)

Note that, in first option we completed a move, so added +1, but in second option move is not done yet.

If turn = SECOND, we'll open another unseen cell. Three things can happen
1. With probability p1, symbol in the SECOND turn matches with the symbol of the FIRST turn.
Here p1 = 1 / unseen
2. With probability p2, symbol in the SECOND move matches with a symbol which is previously seen, but not seen in the first turn of this move. So,
p2 = seen / unseen - p1 = (seen - 1) / unseen
3. With probability p3, symbol in the SECOND move is completely a new symbol.
p3 = 1 - p2 - p3

If (1) happens, this pair will remain visible, so our state will be (total - 1, seen - 1, FIRST) and one move will be completed.
If (2) happens, this pair will be turned back, but we know both cells of the symbol of last turn. So we'll open both symbol in next move, so two moves will be completed in this case and new state will be (total - 1, seen - 1, FIRST).
If (3) happens, just another seen added. So state will be (total, seen + 1, FIRST), and one move will be completed.

exp_move(total, seen, SECOND) = p1 * ((exp_move(total - 1, seen - 1, FIRST) + 1) +
p2 * ((exp_move(total - 1, seen - 1, FIRST) + 2) +
p3 * ((exp_move(total, seen + 1, FIRST) + 1)

My solution in C++:

#include <cstdio>#include <cmath>#include <cstdlib>#include <cctype>#include <algorithm>#include <map>#include <string>#include <vector>#include <iostream>#include <sstream>#include <queue>#include <cstring>#include <set>#include <ctime>#include <cfloat>using namespace std;typedef long long i64;#define I64 "%lld"#define rep(i,n) for (i=0; i < (n); ++i)#define all(c) (c).begin(),(c).end() bool done;double memo;enum{FIRST = 0, SECOND};double exp_move(int total, int seen, int turn) {    double &ret = memo[total][seen][turn];    if (done[total][seen][turn]) return ret;    done[total][seen][turn] = true;    int unseen = total * 2 - seen;    if (total == 0) return ret = 0;    ret = 0.;        // open one unseen cell randomly    if (turn == FIRST) {        double p1 = 0, p2 = 0;        // Probability of it's pair is already seen is p1        if (seen > 0) {            p1 = (double) seen / (double) unseen;            ret += p1 * (exp_move(total - 1, seen - 1, FIRST) + 1);        }        // Probability of it's pair is not seen yet is p2        if (unseen > seen) {            p2 = 1.0 - p1;            ret += p2 * (exp_move(total, seen + 1, SECOND));        }    }    else {        double p1 = 0, p2 = 0, p3 = 0;        // Probability of it's pair is already open is p1        p1 = (double) 1 / (double) unseen;        ret += p1 * (exp_move(total - 1, seen - 1, FIRST) + 1);        // Probability of it's pair is seen but not open is p2        if (seen > 1) {            p2 = (double) (seen - 1) / (double) unseen;            ret += p2 * (exp_move(total - 1, seen - 1, FIRST) + 2);        }        // Probability of it's pair is unseen is p3        if (unseen > seen) {            p3 = 1.0 - p1 - p2;            ret += p3 * (exp_move(total, seen + 1, FIRST) + 1);        }    }    return ret;}class PerfectMemory {    public:    double getExpectation(int N, int M) {        return exp_move((N * M) / 2, 0, FIRST);    }};