Java:多维数组与一维数组

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/2512082/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-13 08:40:31  来源:igfitidea点击:

Java: Multi-dimensional array vs. one-dimensional

javaarraysmultidimensional-array

提问by Mikolan

For example:

例如:

  • a)int [x][y][z]

    vs

  • b)int[x*y*z]

  • 一种)int [x][y][z]

    对比

  • b)int[x*y*z]

Initially thought I'd go with a) for simplicity.

最初以为我会选择 a) 为简单起见。

I know that Java doesn't store arrays linearly in memory like C does, but what implications does this have for my program?

我知道 Java 不像 C 那样在内存中线性存储数组,但这对我的程序有什么影响?

采纳答案by Hyman

Usually the best thing to do when searching anwers for such questions is to see how the choices are compiled into JVM bytecode:

通常,在搜索此类问题的答案时,最好的做法是查看这些选项是如何编译成 JVM 字节码的:

multi = new int[50][50];
single = new int[2500];

This is translated into:

这被翻译成:

BIPUSH 50
BIPUSH 50
MULTIANEWARRAY int[][] 2
ASTORE 1
SIPUSH 2500
NEWARRAY T_INT
ASTORE 2

So, as you can see, the JVM already knows that we are talking about a multi dimensional array.

因此,如您所见,JVM 已经知道我们在谈论多维数组。

Keeping it further:

进一步保持:

for (int i = 0; i < 50; ++i)
    for (int j = 0; j < 50; ++j)
    {
        multi[i][j] = 20;
        single[i*50+j] = 20;
    }

This is translated (skipping the cycles) into:

这被翻译(跳过循环)为:

ALOAD 1: multi
ILOAD 3: i
AALOAD
ILOAD 4: j
BIPUSH 20
IASTORE

ALOAD 2: single
ILOAD 3: i
BIPUSH 50
IMUL
ILOAD 4: j
IADD
BIPUSH 20
IASTORE

So, as you can see, the multi-dimensional array is treated internally in the VM, no overhead generated by useless instructions, while using a single one uses more instructions since offset is calculated by hand.

因此,如您所见,多维数组在 VM 内部进行处理,没有无用指令产生的开销,而使用单个会使用更多指令,因为偏移量是手动计算的。

I don't think that performance will be such an issue.

我不认为性能会成为这样的问题。

EDIT:

编辑:

I did some simple benchmarks to see what's going down here. I chose to try different examples: linear read, linear write, and random access. Times are expressed in millisecs (and calculated using System.nanoTime(). Here are the results:

我做了一些简单的基准测试,看看这里发生了什么。我选择尝试不同的示例:线性读取、线性写入和随机访问。时间以毫秒表示(并使用 计算System.nanoTime()。以下是结果:

Linear write

线性写入

  • Size: 100x100 (10000)
    • Multi: 5.786591
    • Single: 6.131748
  • Size: 200x200 (40000)
    • Multi: 1.216366
    • Single: 0.782041
  • Size: 500x500 (250000)
    • Multi: 7.177029
    • Single: 3.667017
  • Size: 1000x1000 (1000000)
    • Multi: 30.508131
    • Single: 18.064592
  • Size: 2000x2000 (4000000)
    • Multi: 185.3548
    • Single: 155.590313
  • Size: 5000x5000 (25000000)
    • Multi: 955.5299
    • Single: 923.264417
  • Size: 10000x10000 (100000000)
    • Multi: 4084.798753
    • Single: 4015.448829
  • 尺寸:100x100 (10000)
    • 多:5.786591
    • 单身:6.131748
  • 尺寸:200x200 (40000)
    • 多:1.216366
    • 单:0.782041
  • 尺寸:500x500 (250000)
    • 多:7.177029
    • 单身:3.667017
  • 尺寸:1000x1000 (1000000)
    • 多:30.508131
    • 单身:18.064592
  • 尺寸:2000x2000 (4000000)
    • 多:185.3548
    • 单身:155.590313
  • 尺寸:5000x5000 (25000000)
    • 多:955.5299
    • 单身:923.264417
  • 尺寸:10000x10000(100000000)
    • 多:4084.798753
    • 单身:4015.448829

Linear read

线性读取

  • Size: 100x100 (10000)
    • Multi: 5.241338
    • Single: 5.135957
  • Size: 200x200 (40000)
    • Multi: 0.080209
    • Single: 0.044371
  • Size: 500x500 (250000)
    • Multi: 0.088742
    • Single: 0.084476
  • Size: 1000x1000 (1000000)
    • Multi: 0.232095
    • Single: 0.167671
  • Size: 2000x2000 (4000000)
    • Multi: 0.481683
    • Single: 0.33321
  • Size: 5000x5000 (25000000)
    • Multi: 1.222339
    • Single: 0.828118
  • Size: 10000x10000 (100000000)
    • Multi: 2.496302
    • Single: 1.650691
  • 尺寸:100x100 (10000)
    • 多:5.241338
    • 单身:5.135957
  • 尺寸:200x200 (40000)
    • 多:0.080209
    • 单:0.044371
  • 尺寸:500x500 (250000)
    • 多:0.088742
    • 单:0.084476
  • 尺寸:1000x1000 (1000000)
    • 多:0.232095
    • 单:0.167671
  • 尺寸:2000x2000 (4000000)
    • 多:0.481683
    • 单:0.33321
  • 尺寸:5000x5000 (25000000)
    • 多:1.222339
    • 单:0.828118
  • 尺寸:10000x10000(100000000)
    • 多:2.496302
    • 单身:1.650691

Random read

随机读取

  • Size: 100x100 (10000)
    • Multi: 22.317393
    • Single: 8.546134
  • Size: 200x200 (40000)
    • Multi: 32.287669
    • Single: 11.022383
  • Size: 500x500 (250000)
    • Multi: 189.542751
    • Single: 68.181343
  • Size: 1000x1000 (1000000)
    • Multi: 1124.78609
    • Single: 272.235584
  • Size: 2000x2000 (4000000)
    • Multi: 6814.477101
    • Single: 1091.998395
  • Size: 5000x5000 (25000000)
    • Multi: 50051.306239
    • Single: 7028.422262
  • 尺寸:100x100 (10000)
    • 多:22.317393
    • 单身:8.546134
  • 尺寸:200x200 (40000)
    • 多:32.287669
    • 单身:11.022383
  • 尺寸:500x500 (250000)
    • 多:189.542751
    • 单身:68.181343
  • 尺寸:1000x1000 (1000000)
    • 多:1124.78609
    • 单身:272.235584
  • 尺寸:2000x2000 (4000000)
    • 多:6814.477101
    • 单身:1091.998395
  • 尺寸:5000x5000 (25000000)
    • 多:50051.306239
    • 单身:7028.422262

The random one is a little misleading since it generates 2 random numbers for multi-dimensional array while just one for single dimensional (and PNRGs may consume some CPU).

随机数有点误导,因为它为多维数组生成 2 个随机数,而为单维生成一个随机数(并且 PNRG 可能会消耗一些 CPU)。

Mind that I tried to let JIT work by benchmarking only after the 20th run of the same loop. For completeness my java VM is the following:

请注意,我仅在同一循环的第 20 次运行后才尝试通过基准测试来让 JIT 工作。为了完整起见,我的 Java VM 如下:

java version "1.6.0_17" Java(TM) SE Runtime Environment (build 1.6.0_17-b04) Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)

java version "1.6.0_17" Java(TM) SE Runtime Environment (build 1.6.0_17-b04) Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01,混合模式)

回答by Brian Agnew

If you choose the latter route then you're going to have to perform arithmetic for every single array access. That's going to be a pain and error prone (unless you wrap it in a class providing this functionality).

如果您选择后一条路线,那么您将不得不为每个数组访问执行算术运算。这将是一件痛苦且容易出错的事情(除非您将其包装在提供此功能的类中)。

I don't believe that there's any (significant) optimisation in choosing your flat array (especially given the arithmetic taken to index into it). As always with optimisations, you would need to perform some measurements and determine if it's really worth it.

我不相信在选择平面数组时有任何(重要的)优化(特别是考虑到用于索引的算法)。与优化一样,您需要执行一些测量并确定它是否真的值得。

回答by Roman

Use first variant (3-dimensional) because it's easier for understanding and there are less chances to make some logical error (especially if you're using it for modeling 3-dimensional space)

使用第一个变体(3 维),因为它更容易理解并且犯一些逻辑错误的机会更少(特别是如果您使用它来建模 3 维空间)

回答by Esko Luontola

On current CPUs, non-cached memory access is hundreds of times slower than arithmetics (see this presentationand read What every programmer should know about memory). The a) option will result in about 3 memory lookups whereas the b) option will result in about 1 memory lookup. Also the CPU's prefetching algorithms might not work as well. So the b) option can be faster in some situations (it's a hot spot and the array does not fit into the CPU's cache). How much faster? - that will depend on the application.

在当前的 CPU 上,非缓存内存访问比算术慢数百倍(请参阅此演示文稿并阅读每个程序员应该了解的内存知识)。a) 选项将导致大约 3 次内存查找,而 b) 选项将导致大约 1 次内存查找。此外,CPU 的预取算法也可能无法正常工作。因此 b) 选项在某些情况下可能会更快(这是一个热点并且阵列不适合 CPU 的缓存)。快多少?- 这将取决于应用程序。

Personally I would first use the a) option, because it will result in simpler code. If a profiler shows that array access to be a bottleneck, then I would convert it to the b) option, so that there is a pair of helper methods for reading and writing array values (that way the messy code will be restricted to those two methods).

我个人会首先使用 a) 选项,因为它会产生更简单的代码。如果分析器显示数组访问是瓶颈,那么我会将其转换为 b) 选项,以便有一对用于读取和写入数组值的辅助方法(这样混乱的代码将仅限于这两个方法)。

I made a benchmark for comparing 3-dimensional int arrays ("Multi" column) to the equivalent 1-dimensional int arrays ("Single" column). The code is hereand tests here. I ran it on 64-bit jdk1.6.0_18, Windows 7 x64, Core 2 Quad Q6600 @ 3.0 GHz, 4 GB DDR2, using the JVM options -server -Xmx3G -verbose:gc -XX:+PrintCompilation(I have removed the debug output from the following results). The results were:

我做了一个基准来比较 3 维 int 数组(“多”列)与等效的一维 int 数组(“单”列)。代码在这里,测试在这里。我使用 JVM 选项在 64 位 jdk1.6.0_18、Windows 7 x64、Core 2 Quad Q6600 @ 3.0 GHz、4 GB DDR2 上运行它-server -Xmx3G -verbose:gc -XX:+PrintCompilation(我已从以下结果中删除了调试输出)。结果是:

Out of 20 repeats, the minimum time in milliseconds is reported.

Array dimensions: 100x100x100 (1000000)
            Multi   Single
Seq Write   1       1
Seq Read    1       1
Random Read 99      90    (of which generating random numbers 59 ms)

Array dimensions: 200x200x200 (8000000)
            Multi   Single
Seq Write   14      13
Seq Read    11      8
Random Read 1482    1239    (of which generating random numbers 474 ms)

Array dimensions: 300x300x300 (27000000)
            Multi   Single
Seq Write   53      46
Seq Read    34      24
Random Read 5915    4418    (of which generating random numbers 1557 ms)

Array dimensions: 400x400x400 (64000000)
            Multi   Single
Seq Write   123     111
Seq Read    71      55
Random Read 16326   11144    (of which generating random numbers 3693 ms)

This shows the 1-dimensional array to be faster. Though the differences are so small, that for 99% applications it won't be noticable.

这表明一维数组更快。尽管差异很小,但对于 99% 的应用程序来说,它不会很明显。

I did also some measurements to estimate the overhead of generating the random numbers in the Random Read benchmark by replacing preventOptimizingAway += array.get(x, y, z);with preventOptimizingAway += x * y * z;and added the measurements to the above results table by hand. Generating the random numbers takes 1/3 or less of the total time of the Random Read benchmark, so the memory access dominates the benchmark as expected. It would be interesting to repeat this benchmark with arrays of 4 and more dimensions. Probably it would make the speed difference bigger, because the multidimensional array's topmost levels will fit into the CPU's cache, and only the other levels will require a memory lookup.

我也做了一些测试,以评估通过更换产生的随机读取基准的随机数的开销preventOptimizingAway += array.get(x, y, z);preventOptimizingAway += x * y * z;手工,并且增加了测量,以上述结果表。生成随机数需要 Random Read 基准测试总时间的 1/3 或更少,因此内存访问按预期支配了基准测试。用 4 维或更多维的数组重复这个基准测试会很有趣。可能它会使速度差异更大,因为多维数组的最顶层将适合 CPU 的缓存,只有其他级别需要内存查找。