Porting Lucene: Iteration 1 – Decisions about how to start porting…

So time to make a list of files to port. Lets see how many java files there are…

find lucene |grep ".java" |wc -l
5502

Hmm…. OK let’s try focusing on Test cases instead…

find lucene |grep ".java" |grep Test |wc -l
1570

A bit better but not great. Let’s focus on the first thing you need to create an index…

find lucene/core/src/test/org/apache/lucene/store |grep ".java" |grep Test |wc -l
25

Looking good….Oh wait…. and then there are dependencies

 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.lucene.store;

import static org.junit.Assert.*;

import com.carrotsearch.randomizedtesting.RandomizedTest;
import com.carrotsearch.randomizedtesting.Xoroshiro128PlusRandom;
import com.carrotsearch.randomizedtesting.generators.RandomBytes;
import com.carrotsearch.randomizedtesting.generators.RandomNumbers;
import com.carrotsearch.randomizedtesting.generators.RandomPicks;
import com.carrotsearch.randomizedtesting.generators.RandomStrings;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import org.apache.lucene.util.ArrayUtil;
import org.apache.lucene.util.IOUtils.IOConsumer;
import org.junit.Test;

public abstract class BaseDataOutputTestCase extends RandomizedTest {
protected abstract T newInstance();

So org.apache.lucene.util is too big to implement at once.

find lucene/core/src/test/org/apache/lucene/util/ |grep ".java" |grep Test |wc -l
     100

So in theory implementing the rest of core should result in the util package getting fully implemented. Now Lucene also has it’s own Test Framework.

Test Frameworks

So here is the conundrum. The problem with JNI is you are living in two domains (JavaVM & System).It is very tempting to use the Java test cases as a source of truth for the Rust code behavior. The key problem is when it goes wrong is the problem in the java code, JNI, or the rust code? The alternative is to have two sets of books. port the tests to rust and run both. This essentially doubles the amount of work. However should save an incredible amount of time when tests fail. The resulting logic should be

Rust test failsJUnit test failsNext steps
YesYesFocus on changing the code to make the Rust test pass.
YesNoUpdate the Rust tests to match the logic of the JUnit Test
NoYesEnsure the logic in the rust test match the JUnit test. If so focus on the JNI glue code.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s